Bash INI File Parser

This is my attempt at a Bash INI File Parser. It's probably not elegant, certainly not fast, but it does implement a large set of options and features.

I started work on this parser simply because every existing example I could find was either a hack, incomplete, or missing the features I expected from a decent parser. I hope I've come up with something helpful for other people, but it's scratched a personal itch and I'll be using it in my future projects.

Features of the parser include:

  • Global properties section.
  • Unlimited custom section names to contain any number of properties.
  • Sections and keys can be case sensitive, or converted to upper/lower case.
  • Line comments.
  • Duplicate key handling - duplicate keys can be handled in 2 different ways.
  • Custom bound delimiter.
  • Booleans.
  • ... and more!

Usage

The basic usage of the parser is: parse_ini [options] <INI file>. The [options] can be seen using parse_ini --help and have detailed descriptions.

The parser writes Bash associative array declarations and array element definitions to stdout. These Bash commands can be evaled into a script to provide access to every element in the INI file. For example, using eval "$(parse_ini test.ini)" in your script would define a set of arrays whose values can be accessed in the standard Bash way, using the keys from the INI file.
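A minimal sketch of the eval pattern described above. The exact array names that parse_ini emits are not shown in this document, so the sketch uses a stand-in function, demo_parse_ini, and a made-up array name, INI_database. Both are assumptions for illustration only; in a real script you would call parse_ini on your own INI file instead.

```bash
#!/usr/bin/env bash

# Hypothetical stand-in for parse_ini: it prints Bash declarations to
# stdout, which is the kind of output the parser is described as producing.
# The array name INI_database is an assumption, not the parser's real naming.
demo_parse_ini() {
    cat <<'EOF'
declare -A INI_database
INI_database[host]='localhost'
INI_database[port]='5432'
EOF
}

# Evaluate the emitted declarations in the current shell.
eval "$(demo_parse_ini)"

# Values are then read with ordinary associative array syntax.
echo "host=${INI_database[host]} port=${INI_database[port]}"
```

Because eval runs whatever the command prints, this pattern is only appropriate for INI files you trust.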

The functions from the parse_ini script can be included in your own scripts to provide INI file parsing abilities.

INI File Format

The INI file format is a very loose format - there are many options and features which can be supported. I've tried to implement the widest set of features I can, but there may be functionality missing. Some features are only available by enabling them as a --option. See the output of parse_ini --help for the options.

The main features of the supported INI file format are as follows:

General File Format

  • Blank lines are ignored and can be used to separate sections/properties for easy reading.
  • After leading whitespace removal, lines beginning with # or ; are treated as comments and ignored during parsing. Comments must appear on a line on their own.
  • Escaping of shell special characters is not required.
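The blank-line and comment rules above can be sketched roughly as follows. This is not the parser's own code; the function names is_ignored_line and filter_ini_lines are hypothetical and exist only for this illustration.

```bash
#!/usr/bin/env bash

# Hypothetical helper: succeeds (returns 0) when a line would be skipped
# under the rules above, i.e. it is blank or a comment.
is_ignored_line() {
    local line=$1
    # Remove leading whitespace before inspecting the first character.
    line="${line#"${line%%[![:space:]]*}"}"
    [[ -z $line || $line == '#'* || $line == ';'* ]]
}

# Hypothetical helper: print only the lines that are neither blank nor comments.
filter_ini_lines() {
    local line
    while IFS= read -r line || [[ -n $line ]]; do
        is_ignored_line "$line" || printf '%s\n' "$line"
    done
}

filter_ini_lines <<'EOF'
# a comment
   ; an indented comment

name = value
EOF
```

Only the name = value line survives the filter; a # or ; that appears after other text on a line is left alone, since comments must be on a line of their own.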

[section] format

  • Section names may only contain alphanumeric characters, plus _.-+
  • The .-+ characters in section names will be converted to _
  • Section names are case sensitive (unless --ignore-case? is used), so 'Foo' and 'foo' are different sections.
  • Whitespace is ignored before and after the section name.
  • Section names should not be quoted in any way.
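As a rough illustration of the section rules above, the sketch below trims whitespace around the name, rejects names with other characters (including quotes), and converts . - + to _. The function name normalise_section is hypothetical, and the real parser may go about this differently.

```bash
#!/usr/bin/env bash

# Hypothetical helper: extract and normalise the name from a "[section]" line.
# Prints the normalised name, or returns 1 if the line is not a valid header.
normalise_section() {
    local line=$1 name
    # "[ name ]" with optional whitespace around the name; the name may only
    # contain alphanumerics plus _ . + -
    local re='^[[:space:]]*\[[[:space:]]*([[:alnum:]_.+-]+)[[:space:]]*\][[:space:]]*$'
    if [[ $line =~ $re ]]; then
        name=${BASH_REMATCH[1]}
        # Convert each of . + - to an underscore; case is left untouched.
        printf '%s\n' "${name//[.+-]/_}"
    else
        return 1
    fi
}

normalise_section '[ my-section.v1 ]'
normalise_section '["quoted"]' || echo 'rejected'
```

The first call prints my_section_v1; the second is rejected because quote characters are not allowed in a section name.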

Keys

  • Keys may only contain alphanumeric characters, plus _.-+
  • Keys should not be quoted in any way.

Values

Values can optionally be enclosed in single or double quotes.

  • If quotes are to be used, they must be the first and last characters of the value.
  • Whitespace within the quotes is retained verbatim.
  • Backslash line continuation is supported within quotes (but leading whitespace on subsequent lines is removed). Values can be continued by use of \ in the last column.
  • Subsequent lines are subject to leading whitespace removal as normal.
  • Comments are not recognised on subsequent lines - they are treated as part of the value.
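The continuation rules above might look something like this in Bash. It is a sketch under my own assumptions rather than the parser's code, and join_continued_lines is a hypothetical name.

```bash
#!/usr/bin/env bash

# Hypothetical helper: join lines that end in a backslash with the line that
# follows, removing leading whitespace from each continuation line.
join_continued_lines() {
    local line buffer='' continuing=0
    while IFS= read -r line || [[ -n $line ]]; do
        if (( continuing )); then
            # Continuation lines lose their leading whitespace.
            line="${line#"${line%%[![:space:]]*}"}"
        fi
        if [[ $line == *\\ ]]; then
            # Drop the trailing backslash and keep accumulating.
            buffer+="${line%\\}"
            continuing=1
        else
            printf '%s\n' "$buffer$line"
            buffer=''
            continuing=0
        fi
    done
    # Emit whatever is left if the input ended on a continued line.
    [[ -n $buffer ]] && printf '%s\n' "$buffer"
    return 0
}

join_continued_lines <<'EOF'
message = "first part, \
           # second part"
other = 1
EOF
```

Note that the # on the continuation line ends up inside the joined value; it is not treated as a comment.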

Booleans

  • A boolean key prefixed with no_ is set to 0/false; otherwise it is set to 1/true.
  • Later settings of the same key override previous ones - last one wins.
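Here is one possible reading of the boolean rules, sketched in Bash. It assumes that the no_ prefix is stripped to find the real key name, which is my interpretation of "last one wins" rather than something spelled out above. The array name flags and the function set_boolean are hypothetical.

```bash
#!/usr/bin/env bash

# Assumed reading of the rules above: "no_<key>" sets <key> to 0, while a
# plain "<key>" sets it to 1. Later settings overwrite earlier ones.
declare -A flags=()

# Hypothetical helper: record one boolean key in the flags array.
set_boolean() {
    local key=$1
    if [[ $key == no_* ]]; then
        flags["${key#no_}"]=0
    else
        flags["$key"]=1
    fi
}

# "verbose" is switched on, then switched off again by "no_verbose".
for key in verbose no_colour no_verbose; do
    set_boolean "$key"
done

echo "verbose=${flags[verbose]} colour=${flags[colour]}"
```

With the three keys above, both verbose and colour end up as 0, because no_verbose comes after verbose.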

Quotes

  • Quotes are not required for section names, keys or values. However, quotes around a value are needed when it begins or ends with whitespace that should be retained; in that case, wrap the value in either "..." or '...'.
  • Quotes are not required and should not be used around section names or keys.
  • If the value is within quotes ("" or ''), any use of the same quote character (either " or ') must be backslash escaped.
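To make the quoting rules concrete, here is a small sketch that strips a matching pair of outer quotes and unescapes backslash-escaped copies of the same quote character. The function name unquote_value is hypothetical, and the parser itself may handle edge cases differently.

```bash
#!/usr/bin/env bash

# Hypothetical helper: if a value starts and ends with the same quote
# character, strip that pair and unescape backslash-escaped copies of it.
unquote_value() {
    local value=$1 quote
    for quote in '"' "'"; do
        if [[ ${#value} -ge 2 && $value == "$quote"*"$quote" ]]; then
            # Drop the first and last characters (the outer quotes).
            value=${value:1:${#value}-2}
            # Turn \" (or \') back into a plain quote character.
            value=${value//"\\$quote"/$quote}
            break
        fi
    done
    printf '%s\n' "$value"
}

unquote_value '"  keep the padding  "'
unquote_value '"say \"hi\""'
unquote_value 'plain value'
```

Unquoted values pass through unchanged, and whitespace inside the quotes is kept exactly as written.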

http://en.wikipedia.org/wiki/INI_file:

* Provides a good explanation of the ini format - use this for docs *

* INI's have 'sections' and 'properties'. Properties have key = value format *

Case insensitivity: Case is not changed, unless an option is used to convert to lower/upper case.

Comments: Allow ; and # for comments. Must be on their own line.

Blank lines: Blank lines are ignored.

Escape chars: \ at the end of a line will continue it onto next (leading whitespace is removed per normal)

Ordering: GLOBAL section must be at the top, sections continue until next section or EOF.

Duplicate names: Duplicate property values overwrite previous values.

Provide an option to abort/error if a duplicate is found?

Add option to merge duplicates separated by octal byte (\036 ??)

Duplicate sections are merged. Option to error if dup.

Global properties: Support. Add to a GLOBAL section?

Hierarchy: No hierarchy support. Each section is own section.

Name/value delim: Use = by default. Allow : via option?

Quoted values: Allow values to be within " and ' to keep literal formatting.

Whitespace: Whitespace around section labels and []s is removed.

Whitespace within section labels is kept / translated.

Whitespace around property names is removed.

Whitespace within property names is kept as is (spaces squashed - option to override).

Property values have whitespace between = and data removed.

Property values are kept as is (no squashing)

http://www.regular-expressions.info/posixbrackets.html

http://ajdiaz.wordpress.com/2008/02/09/bash-ini-parser/

ff9d46a550/read_ini.sh

http://tldp.org/LDP/abs/html/

Specs:

[section] Can be upper/lower/mixed case (set by options)

Can only include: '-+_. [:alnum:]'

# Any single or consecutive occurrence of '-+_. ' is converted to a single _

# eg: [foo -+_. bar] becomes [foo_bar] ??

Any leading/trailing spaces/tabs between the []s and name will be removed.
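The squashing idea in the spec notes above could be sketched with Bash extended globbing. This only illustrates the [foo -+_. bar] to [foo_bar] example; squash_name is a hypothetical name, and whether the parser actually squashes runs this way is left open by the notes.

```bash
#!/usr/bin/env bash
shopt -s extglob   # needed for the +(...) pattern below

# Hypothetical helper: collapse each run of - + _ . or space into one "_".
squash_name() {
    local name=$1
    printf '%s\n' "${name//+([-+_. ])/_}"
}

squash_name 'foo -+_. bar'
```

For the input shown, this prints foo_bar.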

TODO

  • Specific section parsing: only parse specified section(s) given on the command line (separate by commas?). For the global section, use a single dot (.); for every section but the global one, use *.
  • Allow changing the characters accepted as comments in the INI file.
  • Allow the key/value delimiter to be more than one character.