soundchanger/README.md
2023-03-12 16:14:29 +09:00

Soundchanger

A Python program to apply sound changes to words based on the notation used in historical linguistics.

This program applies sound changes to one or more words. It can be used either as a Python package or as a command-line tool.

The command-line tool sc can be used in the following way:

usage: sc [-h] [-C CATEGORIES] [-i] [-z ZERO_CHARACTERS] [-v] changes strings

positional arguments:
  changes               Sound change to be applied. Multiple sound changes should be separated by a space.
  strings               Word that the sound change should be applied to. Multiple words should be separated by a space.

options:
  -h, --help            show this help message and exit
  -C CATEGORIES, --categories CATEGORIES
                        Categories to be used in the sound change.
  -i, --ignore-errors   Skip erroneous sound changes instead of raising an error.
  -z ZERO_CHARACTERS, --zero-characters ZERO_CHARACTERS
                        Characters that should be empty strings in the changed words.
  -v, --version         show program's version number and exit

Install

Run the following command to install:

pip install git+https://git.beelm.eu/patrick/soundchanger

After this, it will be callable from the command line using the sc command.

sc --help

Quick start

from soundchanger import apply


applied = apply(
    changes = ['p>ɸ/#_', 'ɸ>h/_u'],
    strings = ['pana', 'pune']
)

The variable applied will have the following values:

[
    'ɸana',
    'hune'
]

Usage

def apply(
        changes,
        strings,
        categories={},
        ignore_errors=True,
        zero_characters='∅-'):

Applies a sound change or a list of sound changes to a string or a list of given strings.

Accepts inputs of type str or list. If the input value is of type str, the output will also be of type str.
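The str-in, str-out rule can be pictured with a small wrapper. This is only a sketch, not the package's code: apply_to_words is a hypothetical helper, and it assumes that the type of the strings argument decides the output type.

```python
def apply_to_words(transform, strings):
    """Apply transform to one word or to each word in a list.

    Hypothetical helper: a str input gives a str output,
    a list input gives a list output.
    """
    if isinstance(strings, str):
        return transform(strings)
    return [transform(word) for word in strings]


# str.upper stands in for a real sound change here.
print(apply_to_words(str.upper, 'pana'))            # PANA
print(apply_to_words(str.upper, ['pana', 'pune']))  # ['PANA', 'PUNE']
```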

Options

  • categories (default: {})

    Which categories will be recognized in sound changes. For vowels, this would be {'V': 'aeiou'} or {'V': ['a', 'e', 'i', 'o', 'u']}.

  • ignore_errors (default: True)

    If this option is set to True, any erroneous sound change will be skipped. If set to False, a ValueError will be raised instead.

  • zero_characters (default: '∅-')

    These characters will be removed from the changed words. For example, apply('h>∅', 'aha') will return 'aa', not 'a∅a'.
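The effect of zero_characters can be reproduced with the standard library alone. The helper below is a hypothetical illustration of the described behavior, not the package's own implementation.

```python
def strip_zero_characters(word, zero_characters='∅-'):
    """Delete every character listed in zero_characters from word."""
    # The third argument of str.maketrans lists characters to delete.
    return word.translate(str.maketrans('', '', zero_characters))


print(strip_zero_characters('a∅a'))    # aa
print(strip_zero_characters('ka-ta'))  # kata
```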

Description

The input needs to be in the format of sound changes as used in publications of historical linguistics.

The general structure of a sound change is

A > B / C _ D

which can be read as "A changes to B in the environment after C and before D".

A valid sound change must have at least a value for A and one > character. Thus, the sound change a> applied to the word cat would result in ct. The environment (/C_D) is optional and can be used to specify environment-specific sound changes. Note that the hash symbol (#) marks word boundaries (the beginning and the end of a word). Thus, the sound change p>f/#_ applies only at the beginning of a word: the word pana would change to fana.
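To make the notation concrete, here is a minimal sketch of how a rule string could be split into its four parts. The names parse_change and parse_all are hypothetical, and the check that an environment must contain an underscore is an assumption. The package may parse and validate rules differently.

```python
def parse_change(change):
    """Split 'A>B/C_D' into (target, replacement, left, right)."""
    rule, slash, environment = change.partition('/')
    target, arrow, replacement = rule.partition('>')
    target = target.strip()
    # A valid change needs a value for A and one '>' character.
    if not arrow or not target:
        raise ValueError(f'invalid sound change: {change!r}')
    # Assumption: a given environment must mark the target position with '_'.
    if slash and '_' not in environment:
        raise ValueError(f'environment needs an underscore: {change!r}')
    left, _, right = environment.partition('_')
    return target, replacement.strip(), left.strip(), right.strip()


def parse_all(changes, ignore_errors=True):
    """Parse several changes, skipping invalid ones if ignore_errors is set."""
    parsed = []
    for change in changes:
        try:
            parsed.append(parse_change(change))
        except ValueError:
            if not ignore_errors:
                raise
    return parsed


print(parse_change('A > B / C _ D'))  # ('A', 'B', 'C', 'D')
print(parse_change('p>f/#_'))         # ('p', 'f', '#', '')
print(parse_change('a>'))             # ('a', '', '', '')
print(parse_all(['p>f/#_', 'oops']))  # [('p', 'f', '#', '')]
```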

The input also recognizes categories, which must be specified manually. Common categories include, for example, consonants (C) and vowels (V). A sound change that happens between two vowels must therefore be written in the following way:

from soundchanger import apply


applied = apply(
    changes = ['p>b/V_V'],
    strings = ['paprepup'],
    categories = {
        'V': 'aeiou'
    },
)

This would create the output

[
    'paprebup'
]
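One way to picture what a category does is to expand it into a regular-expression character class. The sketch below uses only the standard re module. The helpers category_class and between are hypothetical, assume single-character category members, and are not taken from the package.

```python
import re


def category_class(members):
    """Turn 'aeiou' or ['a', 'e', ...] into a regex character class."""
    return '[' + ''.join(re.escape(ch) for ch in members) + ']'


def between(target, replacement, left, right, categories, word):
    """Replace target where it stands between two category members."""
    pattern = (
        '(?<=' + category_class(categories[left]) + ')'
        + re.escape(target)
        + '(?=' + category_class(categories[right]) + ')'
    )
    # A function replacement avoids backslash interpretation in re.sub.
    return re.sub(pattern, lambda match: replacement, word)


print(between('p', 'b', 'V', 'V', {'V': 'aeiou'}, 'paprepup'))  # paprebup
print(between('p', 'b', 'V', 'V', {'V': list('aeiou')}, 'apapa'))  # ababa
```

The lookbehind and lookahead do not consume the neighbouring vowels, so a vowel can serve as the right-hand context of one p and the left-hand context of the next.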

Categories can also be combined with other characters to form groups. These need to be written inside curly brackets ({ and }) and separated by a comma (,) or vertical line (|).

apply(['a>o/{#,C}_'], ['aha', 'pana'], categories={'C': 'mnptkswlj'}) results in ['oha', 'pono'], since both instances of a in pana follow a consonant in C.

Environments can also include several characters. apply('o>u/cVc_nut#', 'coconut', categories={'V': 'aeiou'}) results in 'cocunut'.
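Groups, word boundaries and multi-character environments can be combined in one matcher. The following is a rough sketch under stated assumptions, not the package's algorithm: side_to_regex and apply_one are hypothetical names, the word is padded with # on both sides so that boundaries behave like ordinary characters, and every environment is checked against the unchanged word.

```python
import re


def side_to_regex(side, categories):
    """Translate one side of an environment (C or D) into a regex."""
    parts = []
    i = 0
    while i < len(side):
        ch = side[i]
        if ch == '{':
            # A group: alternatives separated by ',' or '|'.
            end = side.index('}', i)
            options = re.split(r'[,|]', side[i + 1:end])
            parts.append('(?:' + '|'.join(
                side_to_regex(option.strip(), categories)
                for option in options) + ')')
            i = end + 1
            continue
        if ch in categories:
            parts.append('[' + ''.join(
                re.escape(c) for c in categories[ch]) + ']')
        else:
            parts.append(re.escape(ch))
        i += 1
    return ''.join(parts)


def apply_one(word, target, replacement, left='', right='', categories=None):
    """Apply a single change A > B / C _ D to one word."""
    if not target:
        raise ValueError('a sound change needs a target')
    categories = categories or {}
    padded = '#' + word + '#'
    left_re = re.compile('(?:' + side_to_regex(left, categories) + r')\Z')
    right_re = re.compile(side_to_regex(right, categories))
    out = []
    i = 0
    while i < len(padded):
        if (padded.startswith(target, i)
                and left_re.search(padded[:i])
                and right_re.match(padded[i + len(target):])):
            out.append(replacement)
            i += len(target)
        else:
            out.append(padded[i])
            i += 1
    # Drop the two padding characters again.
    return ''.join(out)[1:-1]


print(apply_one('aha', 'a', 'o', left='{#,C}',
                categories={'C': 'mnptkswlj'}))  # oha
print(apply_one('coconut', 'o', 'u', left='cVc', right='nut#',
                categories={'V': 'aeiou'}))      # cocunut
```

Scanning positions by hand, rather than using a single lookbehind, sidesteps the fixed-width restriction that Python's re module places on lookbehind patterns.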