Ian Goodale, MSIS

Click a button below to display a selection of my projects written in that programming language.


Python

Domovyk - View on GitHub

A Python package to transliterate various Cyrillic alphabets to and from the Latin alphabet using the American Library Association-Library of Congress Romanization tables. Domovyk addresses some limitations in existing transliteration packages, providing multilingual functionality, support for composite Unicode characters, and support for languages not addressed in other packages, such as Church Slavonic and Carpatho-Rusyn. This package aims to increase the accessibility of transliteration technologies for users working in these languages, focusing on use cases that require thorough and accurate transliteration.
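As a rough illustration of why composite Unicode characters matter for transliteration, here is a minimal sketch. It is not Domovyk's actual API: the function name `transliterate` and the `SAMPLE_TABLE` mapping are hypothetical, and the table is a tiny subset of Russian letters rather than a full romanization table. The idea is to normalize the input to NFC first, so that a letter typed as a base character plus a combining mark matches the same table entry as its precomposed form.

```python
import unicodedata

# Hypothetical, illustrative subset of a Cyrillic-to-Latin table.
# This is not Domovyk's data and not a complete romanization table.
SAMPLE_TABLE = {
    "\u0430": "a",       # а
    "\u0431": "b",       # б
    "\u0432": "v",       # в
    "\u0434": "d",       # д
    "\u0436": "zh",      # ж
    "\u0438": "i",       # и
    "\u0439": "\u012d",  # й -> i with breve
    "\u043a": "k",       # к
    "\u043c": "m",       # м
    "\u043e": "o",       # о
    "\u0448": "sh",      # ш
}


def transliterate(text, table=SAMPLE_TABLE):
    """Transliterate text character by character using a lookup table.

    The input is normalized to NFC so that decomposed sequences
    (for example, U+0438 followed by combining breve U+0306) are
    composed into single code points before the lookup. Characters
    missing from the table are passed through unchanged.
    """
    composed = unicodedata.normalize("NFC", text)
    return "".join(table.get(ch, ch) for ch in composed)


# The same word, once with a precomposed letter and once with a
# base letter plus combining breve; both forms give the same result.
precomposed = transliterate("\u043c\u043e\u0439")
decomposed = transliterate("\u043c\u043e\u0438\u0306")
```

Without the normalization step, the decomposed form would be looked up as two separate characters and the combining mark would leak into the output.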

Rozha - View on GitHub

Rozha is a Python package that simplifies and streamlines a number of natural language processing (NLP) tasks and methods for a wide variety of languages, empowering users to apply NLP to both non-English and English texts.

Word Goblin - View on GitHub

WordGoblin is a simple, lightweight word finder package that returns words containing letters specified by the user. English is supported within the package, and custom dictionaries or lists of words can easily be loaded to work with other languages.
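The core idea of a letter-based word finder can be sketched in a few lines. This is not WordGoblin's actual interface: the function name `find_words` and the sample word list are hypothetical, and the sketch assumes that a match means the word contains every requested letter at least once.

```python
def find_words(letters, words):
    """Return the words that contain every letter in `letters`.

    Matching is case-insensitive and ignores how often a letter
    is repeated; the input order of `words` is preserved.
    """
    required = set(letters.lower())
    return [word for word in words if required <= set(word.lower())]


# A hypothetical custom word list; any iterable of strings works,
# so a list for another language could be loaded the same way.
SAMPLE_WORDS = ["goblin", "global", "bingo", "lingo", "word"]

matches = find_words("gob", SAMPLE_WORDS)
```

Here `matches` holds "goblin", "global", and "bingo", since "lingo" lacks a "b" and "word" lacks both "g" and "b".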

Seshata - View on GitHub

Seshata is a streamlined, console-based journal/database program written in Python. It stores images as well as text in its SQL database, and it can print stored images in the console.
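Storing images alongside text in a SQL database usually comes down to a binary column. The sketch below shows that pattern with Python's built-in `sqlite3` module. It is an assumption-laden illustration, not Seshata's code: the choice of SQLite, the `entries` table, and the helper names are all hypothetical.

```python
import sqlite3


def open_journal(path=":memory:"):
    """Open (or create) a journal database with a text and image column."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS entries ("
        "id INTEGER PRIMARY KEY, "
        "body TEXT NOT NULL, "
        "image BLOB)"  # raw image bytes; NULL when an entry has no image
    )
    return conn


def add_entry(conn, body, image_bytes=None):
    """Insert one journal entry and return its row id."""
    cursor = conn.execute(
        "INSERT INTO entries (body, image) VALUES (?, ?)",
        (body, image_bytes),
    )
    conn.commit()
    return cursor.lastrowid


def get_entry(conn, entry_id):
    """Return (body, image_bytes) for an entry, or None if it is missing."""
    return conn.execute(
        "SELECT body, image FROM entries WHERE id = ?", (entry_id,)
    ).fetchone()


# Usage with an in-memory database and placeholder bytes standing in
# for the contents of an image file.
conn = open_journal()
entry_id = add_entry(conn, "First entry", b"\x89PNG placeholder")
body, image = get_entry(conn, entry_id)
```

In a real program the bytes would come from reading an image file in binary mode, and rendering them in the console would need a separate step.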

Non-English Natural Language Processing - View on GitHub

A suite of GitHub repositories dedicated to performing natural language processing (NLP) tasks on Russian and French texts. The tutorial portion of the suite, linked above, contains Jupyter notebooks with walkthroughs and sample code for analyzing non-English texts using various NLP methods, including Python scripts to clean, analyze, and visualize text. The notebooks were designed to accompany a workshop offered by the University of Texas Libraries.

PyGallica - View on GitHub

A Python wrapper for the National Library of France's digital library platform, Gallica. My wrapper is featured on the official Gallica site, and can also be found on my GitHub account at the link above.

The Simple Web Archiver - View on GitHub

An easy-to-use web archiving tool with a graphical interface, written in Python. Pages can be downloaded as files (HTML, CSS, etc.) or as WARCs. Images from the pages can also be downloaded.

Europeana Full Text in Python - View on GitHub

A variety of Python scripts to assist with searching and downloading full text records via the Europeana APIs, including newspaper records. These scripts allow you to search records in Europeana, parse the JSON returned to obtain various metadata, and download full text if it's available. The code was featured on the Europeana site, and is hosted on my GitHub account at the link above.
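The step of parsing returned JSON for metadata can be sketched without any network access. The example below is not taken from these scripts: the payload is hand-written for illustration, and its field names (`items`, `id`, `title`) are assumptions about the response shape, as is the helper name `summarize_items`.

```python
import json

# Hand-written sample payload for illustration only. The field names
# are assumptions, and the identifiers are made up.
SAMPLE_RESPONSE = """
{
  "totalResults": 2,
  "items": [
    {"id": "/0000/example_1", "title": ["Example newspaper issue"]},
    {"id": "/0000/example_2", "title": ["Another example record"]}
  ]
}
"""


def summarize_items(payload):
    """Return a list of (id, first title) pairs from a JSON search result.

    Missing keys are tolerated: a record without a title yields None
    in the second position, and a payload without items yields [].
    """
    data = json.loads(payload)
    summary = []
    for item in data.get("items", []):
        titles = item.get("title") or [None]
        summary.append((item.get("id"), titles[0]))
    return summary


records = summarize_items(SAMPLE_RESPONSE)
```

A real script would fetch the payload over HTTP with an API key and then use the record identifiers to request full text where it is available.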