Unified Tokenizer startup script for OmegaT

This post in Ukrainian.


Situation

So, you downloaded and installed the Tokenizers to make OmegaT’s searches for matches and glossary terms more effective. The Readme file says you can create a separate script for each language you translate from (provided a tokenizer exists for it), and it even gives some examples. So that’s what you did, and now everything works, but you long for deeper harmony in life and less hassle.

Problem

What you need is a way to choose the language you are going to translate from and the tokenizer you are going to use, with just one script, and thus only one launcher on your Desktop/Panel/Start Menu or one command somewhere on the $PATH.

Solution

A solution was found. For it to work under GNU/Linux, you need bash, zenity, and cat.
Below is a script to place in $HOME/bin or /usr/local/bin, make executable, and name whatever you find convenient (for example ottok, short for OmegaTTOKenizers).

#!/bin/bash
# Remember where OmegaT is installed; ask only on the first run.
CONF_DIR="$HOME/.omegat"
CONF_FILE="$CONF_DIR/ompath"
if [ ! -s "$CONF_FILE" ]; then
	mkdir -p "$CONF_DIR"
	# File-selection dialogs have no --text option, so the hint goes in the title.
	JAR=$(zenity --file-selection \
		--title="Select OmegaT.jar in the folder where OmegaT is installed") || exit 0
	dirname "$JAR" > "$CONF_FILE"
fi
OM_PATH=$(cat "$CONF_FILE")

# Select the tokenizer language for your project
TokLang=$(zenity --list --title="Source Language" \
	--text="Select source language for the project" \
	--radiolist --window-icon="$OM_PATH/OmegaT.png" \
	--height="720" \
	--column="" --column="Language" \
	FALSE "Arabic" \
	FALSE "Brazilian" \
	FALSE "Chinese" \
	FALSE "CJK" \
	FALSE "Czech" \
	FALSE "Danish" \
	FALSE "Dutch" \
	FALSE "English" \
	FALSE "Finnish" \
	FALSE "French" \
	FALSE "German" \
	FALSE "German2" \
	FALSE "Greek" \
	FALSE "Hungarian" \
	FALSE "Italian" \
	FALSE "Norwegian" \
	FALSE "Persian" \
	FALSE "Porter" \
	FALSE "Portuguese" \
	FALSE "Romanian" \
	FALSE "Russian" \
	FALSE "SmartChinese" \
	FALSE "Spanish" \
	FALSE "Swedish" \
	FALSE "Thai" \
	FALSE "Turkish" \
	FALSE "Japanese")
# Stop if the dialog was canceled or closed without a selection
if [ $? -ne 0 ] || [ -z "$TokLang" ]; then
	zenity --error --title="Canceled" --text="Canceled" --timeout=1
	exit 0
fi

# Pick the analyzer: some languages offer a choice, the rest have only one
case "$TokLang" in
	Dutch|French|German|Russian)
		Analyzer=$(zenity --list --title="Analyzer" \
			--text="Select the analyzer for your language" --radiolist \
			--window-icon="$OM_PATH/OmegaT.png" \
			--column="" --column="Analyzer" TRUE "Snowball" FALSE "Lucene")
		if [ $? -ne 0 ] || [ -z "$Analyzer" ]; then
			zenity --error --title="Canceled" --text="Canceled" --timeout=1
			exit 0
		fi
		;;
	Japanese)
		Analyzer="TinySegmenter"
		;;
	Arabic|Brazilian|Chinese|CJK|Czech|Greek|Persian|SmartChinese|Thai)
		Analyzer="Lucene"
		;;
	Danish|English|Finnish|German2|Hungarian|Italian|Norwegian|Porter|\
	Portuguese|Romanian|Spanish|Swedish|Turkish)
		Analyzer="Snowball"
		;;
esac

# Class name pattern: <Analyzer><Language>Tokenizer
TOKENIZER="org.omegat.plugins.tokenizer.${Analyzer}${TokLang}Tokenizer"
cd "$OM_PATH" || exit 1
./OmegaT --ITokenizer="$TOKENIZER"
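
If you want to check which tokenizer class the script would pass to OmegaT without clicking through the dialogs, the mapping can be pulled out into a small function. This is only a sketch: the function name tokenizer_class is made up for illustration, the language-to-analyzer table is copied from the script above, and for the four languages that offer a choice it assumes Snowball unless you pass "Lucene" as a second argument.

```shell
#!/bin/bash
# Hypothetical helper (not part of the original script): print the tokenizer
# class name for a source language, using the same mapping as the script above.
# Usage: tokenizer_class LANGUAGE [ANALYZER]
tokenizer_class() {
	local lang=$1 analyzer
	case "$lang" in
		Dutch|French|German|Russian)
			# Assumption: default to Snowball, as the dialog preselects it
			analyzer=${2:-Snowball} ;;
		Japanese)
			analyzer="TinySegmenter" ;;
		Arabic|Brazilian|Chinese|CJK|Czech|Greek|Persian|SmartChinese|Thai)
			analyzer="Lucene" ;;
		Danish|English|Finnish|German2|Hungarian|Italian|Norwegian|Porter|\
		Portuguese|Romanian|Spanish|Swedish|Turkish)
			analyzer="Snowball" ;;
		*)
			echo "Unknown language: $lang" >&2
			return 1 ;;
	esac
	echo "org.omegat.plugins.tokenizer.${analyzer}${lang}Tokenizer"
}

tokenizer_class German Lucene
tokenizer_class Japanese
```

The two calls at the end print the class names for German with the Lucene analyzer and for Japanese; an unknown language prints an error to stderr and returns a non-zero status.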

UPDATE:
An updated version of the script for the new version of the plugin is here


Good luck!
