Export OmegaT project to an HTML table

A few years ago I wrote a script that exported a whole OmegaT project to an HTML table. I used it a lot myself, and I know quite a few other people found it helpful too. The problem with the table produced by that script was that it had no way to show repeated or alternatively translated segments. I’ve rewritten the script since, but never published an announcement about the new version. Now I’ve made a few more changes, and thought it was about time to fix that omission.

Download the script from the SF.net or GitHub.com repo.

The script has a number of options (lines 26-42) to tweak the output table. Change them as needed and run the script.

//// Script Options
all_project = false //Set to true to export the whole project; if false, only the current file is exported
autoopen = "none" //Automatically open the table file upon creation ("folder"|"table"|"none")
skipUntran = true //Skip untranslated segments (true|false)
fillEmptTran= false //Add custom string to empty translations, i.e. where translation is INTENTIONALLY set to empty (true|false)
EmptTranTxt = "" //String to replace empty translations (in quotes, ignored if above is false)
markNonUniq = true //Add color background to non-unique segments
markAlt = true //Add color frame around segments with alternative translation
addExtraCol = true //Add column with info about uniqueness and alternative translation
extraColStr = "XX" //Extra column caption (in quotes, ignored if above is false)
uniqStr = "" //Cell mark for unique segments
firstStr = "1" //Cell mark for the 1st occurrence of a repeated segment
repStr = "+" //Cell mark for further occurrences of a repeated segment
altStr = "a" //Cell mark for alternative translation of the segment
defStr = "" //Cell mark for default translation of the segment
markFiles   = true      //First segments in files will have a different color for the top border
markPara    = true      //First segments in paragraphs will have a different color for the top border

With these options you can export the whole project or just the current file (the resulting file name depends on the selected scope). You can have only two columns (source and target), or add a third column that shows whether the segment is repeated or has an alternative translation, and so on.
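To give an idea of what the extra column contains, here is a minimal Groovy sketch of how the uniqStr, firstStr and repStr marks could be assigned to a list of source segments. It is only an illustration: the sample segments and the helper names (sources, counts, seen, marks) are made up, and the real script reads its data from the OmegaT project and may do this differently.

```groovy
// Option values copied from the script options listed above
def uniqStr  = ""    // mark for unique segments
def firstStr = "1"   // mark for the 1st occurrence of a repeated segment
def repStr   = "+"   // mark for further occurrences of a repeated segment

// Hypothetical sample data: source segments in document order
def sources = ["Open the file.", "Click OK.", "Open the file.", "Save.", "Open the file."]

// Count how many times each source text occurs
def counts = sources.countBy { it }

// Repeated source texts that have already been seen once
def seen = [] as Set

// One mark per segment, in the same order as the segments
def marks = sources.collect { src ->
    if (counts[src] == 1) {
        uniqStr              // occurs only once
    } else if (seen.add(src)) {
        firstStr             // Set.add returns true the first time this text is added
    } else {
        repStr               // any later occurrence
    }
}

println marks
```

With the default values, unique segments get an empty cell, so only repeated segments stand out in the extra column.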

Another cool feature is that the table shows how the source and target text is split into paragraphs: though each segment gets its own cell in the table, cell borders are formatted differently to show the paragraph boundaries (optional, on by default). This matters when the script is used for editing or proofreading. It also marks the first segment of each file if the whole project is exported (optional as well).
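The paragraph marking can be pictured with another small Groovy sketch. Everything in it is an assumption made for illustration: the sample segments, the paraStart flag and the "para" CSS class are invented here, and the actual script takes paragraph information from OmegaT and has its own styling. The idea is simply that the row starting a paragraph gets a class that a stylesheet can give a different top border.

```groovy
// Same meaning as the script option of that name
def markPara = true

// Hypothetical sample data: source, target and a flag for the first segment of a paragraph
def segments = [
    [src: "First sentence.",  tgt: "Erster Satz.",  paraStart: true],
    [src: "Second sentence.", tgt: "Zweiter Satz.", paraStart: false],
    [src: "New paragraph.",   tgt: "Neuer Absatz.", paraStart: true]
]

// Build one table row per segment; paragraph starts get a CSS class (hypothetical name)
def rows = segments.collect { seg ->
    def cls = (markPara && seg.paraStart) ? ' class="para"' : ''
    "<tr${cls}><td>${seg.src}</td><td>${seg.tgt}</td></tr>".toString()
}

println rows.join("\n")
```

A real export would also have to escape HTML special characters in the segment text, which this sketch skips for brevity.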

Comments and suggestions are always welcome. The best place for requests or bug reports is the tracker section of the SF.net or GitHub.com repo.
But if you really want to report them here, be my guest. I also never refuse a good cup of coffee.

Happy HTML exporting!
