# Soundchanger

A Python program to apply sound changes to words based on the notation used in historical linguistics.

This program applies sound changes to a number of words. It can be used either as a Python package or as a command line tool.

The command line tool `sc` can be used in the following way:
```bash
usage: sc [-h] [-C CATEGORIES] [-i] [-z ZERO_CHARACTERS] [-v] changes strings

positional arguments:
  changes               Sound change to be applied. Multiple sound changes should be separated by a space.
  strings               Word that the sound change should be applied to. Multiple words should be separated by a space.

options:
  -h, --help            show this help message and exit
  -C CATEGORIES, --categories CATEGORIES
                        Categories to be used in the sound change.
  -i, --ignore-errors   Categories to be used in the sound change.
  -z ZERO_CHARACTERS, --zero-characters ZERO_CHARACTERS
                        Characters that should be empty strings in the changed words.
  -v, --version         show program's version number and exit
```
## Install

Run the following command to install:

```
pip install git+https://git.beelm.eu/patrick/soundchanger
```

After this, it will be callable from the command line using the `sc` command.

```bash
sc --help
```
## Quick start

```python
from soundchanger import apply

applied = apply(
    changes=['p>ɸ/#_', 'ɸ>h/_u'],
    strings=['pana', 'pune']
)
```

The variable `applied` will have the following values:

```python
[
    'ɸana',
    'hune'
]
```
## Usage

```
def apply(
    changes,
    strings,
    categories={},
    ignore_errors=True,
    zero_characters='∅-'):
```
Applies a sound change or a list of sound changes to a string or
a list of given strings.

Accepts inputs of type str or list.
If the input value is of type str, the output will also be of type str.

**Options**

- categories (default: {})

    Which categories will be detected.
    For vowels, this would be `{'V': 'aeiou'}` or `{'V': ['a', 'e', 'i', 'o', 'u']}`.

- ignore_errors (default: True)

    If this option is set to `True`, any erroneous sound change will be skipped.
    If set to `False`, a `ValueError` will be raised instead.

- zero_characters (default: '∅-')

    These characters will be removed from the changed words.
    For example, `apply('h>∅', 'aha')` will return 'aa', not 'a∅a'.
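
The str-in/str-out convention and the `zero_characters` option described above can be pictured with a small stand-alone sketch. It does not import `soundchanger`; the helper name `strip_zero_characters` is hypothetical and only mirrors the documented behaviour, not the package's actual code.

```python
def strip_zero_characters(strings, zero_characters='∅-'):
    """Remove every zero character (hypothetical helper, for illustration).

    Returns a str for str input and a list for list input.
    """
    # A translation table that deletes each zero character.
    table = str.maketrans('', '', zero_characters)
    if isinstance(strings, str):
        return strings.translate(table)
    return [s.translate(table) for s in strings]


print(strip_zero_characters('a∅a'))           # aa
print(strip_zero_characters(['a∅a', 'b-a']))  # ['aa', 'ba']
```

With the default `'∅-'`, both `∅` and `-` can be used in a rule to mark deletion without either symbol ending up in the result.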
## Description

The input needs to be in the format of sound changes as used in publications of historical linguistics.

The general structure of a sound change is

    A > B / C _ D

which can be read as "A changes to B in the environment after C and before D".

A valid sound change must have at least a value for `A` and one `>` character. Thus, the sound change `a>` applied to the word `cat` would result in `ct`. The environment (`/C_D`) is optional and can be used to specify environment-specific sound changes. Note that the hash symbol (`#`) marks word boundaries (beginning and end of a word). Thus, the sound change `p>f/#_` applies only at the beginning of a word: the word `pana` would change to `fana`.

The input also recognizes categories, which must be specified manually. Common categories include, for example, consonants (`C`) and vowels (`V`). A sound change that happens between two vowels must therefore be written in the following way:
```python
from soundchanger import apply

applied = apply(
    changes=['p>b/V_V'],
    strings=['paprepup'],
    categories={
        'V': 'aeiou'
    },
)
```

This would create the output

```python
['paprebup']
```
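
To make the notation concrete, here is a minimal regular-expression sketch of how a rule of the form `A>B/C_D` could be applied to a single word. This is not how `soundchanger` is implemented (its internals are not shown in this README); `apply_rule` is a hypothetical helper that assumes single-character category members and does not handle groups in curly brackets.

```python
import re


def apply_rule(rule, word, categories=None):
    """Apply one rule of the form 'A>B/C_D' to one word (illustrative only)."""
    categories = categories or {}
    rule = rule.replace(' ', '')
    change, _, environment = rule.partition('/')
    target, arrow, replacement = change.partition('>')
    if not target or not arrow:
        raise ValueError(f'invalid sound change: {rule!r}')
    before, _, after = environment.partition('_')

    def to_regex(segment):
        # Category keys become character classes; everything else is literal.
        parts = []
        for char in segment:
            if char in categories:
                parts.append('[' + re.escape(''.join(categories[char])) + ']')
            else:
                parts.append(re.escape(char))
        return ''.join(parts)

    pattern = to_regex(target)
    if before:
        pattern = '(?<=' + to_regex(before) + ')' + pattern
    if after:
        pattern = pattern + '(?=' + to_regex(after) + ')'

    # Pad the word with '#' so that '#' in a rule matches a word boundary.
    padded = '#' + word + '#'
    changed = re.sub(pattern, lambda match: replacement, padded)
    return changed.strip('#')


print(apply_rule('p>b/V_V', 'paprepup', {'V': 'aeiou'}))  # paprebup
print(apply_rule('p>f/#_', 'pana'))                       # fana
print(apply_rule('a>', 'cat'))                            # ct
```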
Categories can also be combined with other characters to form groups. These need to be written inside curly brackets (`{` and `}`) and separated by a comma (`,`) or vertical line (`|`).

```python
apply(['a>o/{#,C}_'], ['aha', 'pana'], categories={'C': 'mnptkswlj'})
```
results in

```python
['oha', 'pono']
```
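
A group such as `{#,C}` can be thought of as a regular-expression alternation. The sketch below shows one possible translation under that assumption; it is not the library's code. `group_to_regex` is a hypothetical helper, and the word is padded with `#` so that the boundary symbol can be matched literally.

```python
import re


def group_to_regex(group, categories):
    """Turn a group such as '{#,C}' into a regex alternation (illustrative only)."""
    # Members may be separated by a comma or a vertical line.
    members = re.split(r'[,|]', group.strip('{}'))
    alternatives = []
    for member in members:
        if member in categories:
            # A category key matches any one of its member characters.
            alternatives.append('[' + re.escape(''.join(categories[member])) + ']')
        else:
            alternatives.append(re.escape(member))
    return '(?:' + '|'.join(alternatives) + ')'


fragment = group_to_regex('{#,C}', {'C': 'mnptkswlj'})
# 'a' preceded by either a word boundary ('#') or a consonant from C.
pattern = re.compile('(?<=' + fragment + ')a')
print(pattern.sub('o', '#aha#').strip('#'))  # oha
```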
Environments can also include several characters.

```python
apply('o>u/cVc_nut#', 'coconut', categories={'V': 'aeiou'})
```

results in

```python
'cocunut'
```