Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

January 16, 2014

Posted under

omegat, SVN, Team Project

Comments

4 Comments

Locked OmegaT Team Project (SVN)

Situation

One of the translators involved in a team translation project using a very cool OmegaT Team project feature gets a message about the project being locked.

Problem

The translator’s connectivity is rather limited and re-downloading the whole project would be very undesirable. There’s also no SVN software installed on her computer (other than OmegaT itself). Otherwise the quickest solution would be either getting OmegaT to download the project anew into an empty folder, or running “svn cleanup” on the project’s folder using one of the available SVN tools. Bad luck this time.

Solution

Since OmegaT can act as an SVN client itself (that’s what it does when a Team project is loaded), and there’s a cool scripting functionality, why not just exploit OmegaT to do the cleanup on any folder of our choice? So, here’s the script (the heading is also a link):

svn_cleanup_selected.groovy

/*
 * Perform SVN cleanup on any local SVN repository
 *
 * @author	Yu Tang
 * @author	Kos Ivantsov
 * @date	2014-01-17
 * @version	0.2
 */

import javax.swing.JFileChooser
import org.omegat.core.team.SVNRemoteRepository
import org.tmatesoft.svn.core.wc.*

def folder

if (project.isProjectLoaded()) {
	def prop = project.getProjectProperties()
	folder = new File(prop.getProjectRoot())
	}else{
	JFileChooser fc = new JFileChooser(
		dialogTitle: "Choose SVN repository to perform cleanup",
		fileSelectionMode: JFileChooser.DIRECTORIES_ONLY,
		multiSelectionEnabled: false
		)
	if(fc.showOpenDialog() != JFileChooser.APPROVE_OPTION) {
		console.println "Canceled"
		return
		}
	folder = new File(fc.getSelectedFile().toString())
	}

if (SVNRemoteRepository.isSVNDirectory(folder)) {
	def clientManager = SVNClientManager.newInstance()
	clientManager.getWCClient().doCleanup(folder)
	console.println("Cleanup done!!")
	}
return

If that pesky message pops up, it should usually suffice to fire up the script with no project loaded, browse to the problematic project folder and wait for the “Cleanup done!!” message in the Scripting console. (If a project is loaded, the script skips the folder chooser and cleans up the current project, provided it happens to be a Team project.) After that, the project should open without problems.
After the fix there might be a problem with sync, and OmegaT may show a warning about it, but that is usually easily fixed with Ctrl+S (Save) or F5 (Reload).

I wish you all good and stable Internet connection, responsive and responsible team members and a glitch-less work-flow.

But as of now,

Good luck

UPDATE:

The script is now bundled with OmegaT, so you’ll get it along with the program. No need to download it from the links posted here.

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

January 16, 2014

Posted under

groovy, omegat, tags, TMX

Comments

16 Comments

Convert OmegaT project to XLIFF for other CAT tools

I’m back with another little script that might be pretty handy for those who need to work on the same material in different CAT tools, or for translation agencies who use OmegaT as their main CAT application but farm out the work to translators using their CAT tools of choice. As a matter of fact, the script was requested by translation agency Velior for this very reason.
When the script is invoked, it writes out a file named PROJECTNAME.xlf (PROJECTNAME is the actual name of the project, not this loudly yelled word, of course) in the script_output subfolder of the current project. It exports both translated and untranslated segments: translated segments get the “final” state in the resulting XLF file, while for untranslated segments the source is copied to the target and the state is set to “needs-translation”. OmegaT segmentation and tags are preserved. Tags get enveloped in <ph id="x"> and </ph>, so that they are treated as tags in other CAT tools. Continue reading →
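To make the tag handling a bit more tangible, here is a minimal sketch of how OmegaT-style tags could be wrapped in <ph> elements and how the two states could be assigned. It is not the actual script (that one lives in the full post): the helper names escapeXml, wrapTags and transUnit, the running id numbering and the bare trans-unit layout are all my assumptions, made up for illustration only.

```groovy
// Hypothetical helper: escape the characters that are special in XML text.
def escapeXml = { String s ->
    s.replace('&', '&amp;').replace('<', '&lt;').replace('>', '&gt;')
}

// Hypothetical helper: wrap every OmegaT-style tag (<f0>, </f0>, <br1/>)
// in a <ph> element with a running id (the numbering scheme is an assumption).
def wrapTags = { String segment ->
    def tagPattern = /<\/?[a-z]+[0-9]* ?\/?>/
    def out = new StringBuilder()
    int last = 0
    int id = 0
    def m = segment =~ tagPattern
    while (m.find()) {
        out << escapeXml(segment.substring(last, m.start()))
        id++
        out << "<ph id=\"${id}\">" << escapeXml(m.group()) << '</ph>'
        last = m.end()
    }
    out << escapeXml(segment.substring(last))
    out.toString()
}

// Hypothetical helper: translated segments get state "final"; untranslated
// ones get the source copied to the target and state "needs-translation".
def transUnit = { int id, String source, String translation ->
    def translated = translation != null && !translation.isEmpty()
    def state = translated ? 'final' : 'needs-translation'
    def target = translated ? translation : source
    ("<trans-unit id=\"${id}\"><source>${wrapTags(source)}</source>" +
     "<target state=\"${state}\">${wrapTags(target)}</target></trans-unit>").toString()
}

println transUnit(1, 'Click <f0>here</f0>', '')
```

The tag text is escaped inside <ph> so that the file stays well-formed XML while the receiving tool still sees each tag as one protected unit.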

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

September 10, 2013

Posted under

groovy, omegat, TMX

Comments

7 Comments

Export relevant TU’s from legacy TMX files in OmegaT

Situation

You have a new project and legacy translation memory files that usually go to the /tm folder of an OmegaT project. You need to give this project to someone else, but you don’t want to give away all of your previous translations. Somehow you need to extract from your TMX files only those TU’s that have matches in the current project.

Problem

The problem is evident — you need OmegaT to get matches for each segment, and if they are any good, store them somewhere handy, in a separate TMX file.
The problem has been (or still is) discussed on OmegaT Yahoo Group.

Solution

Continue reading →

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

August 12, 2013

Posted under

GNU/Linux, groovy, omegat

lame GUI update to the new TMX export

This is an update to the previous post about exporting new translations to a TMX.
The script doesn’t have a GUI to select date and time or to specify whether it should work globally or on selected files. This update still doesn’t have that GUI, but provides for an external program to fill that gap. In this post I’m sharing the updated groovy script and a simple bash+zenity wizard-like script for Linux that acquires necessary data.
If no such external program/script exists, the groovy script continues to work as before without any extra fuss.

write_new_trans2TMX_extGUI.groovy

/*
 * Purpose:	 Export new translations completed after the specified 
 * 	 date (line 21) either for the entire project or for the
 * 	 selected files ("select_files" must be set to 'yes' — line 27)
 * 	 to TMX file
 * #Files:	 Writes 'translated_after_<date_time>.tmx'
 * 	 in the current project's root
 * #File format:	 TMX v.1.4
 * #Details:	http://wp.me/p3fHEs-6z
 *
 * @author  Kos Ivantsov
 * @date    2013-08-12
 * @version 0.3
 */

/*
 * The date should be specified as "year-month-day HOURS:minutes"
 * If not specified or specified wrongly, the script will look for
 * translations that are newer than one day. 
 */ 
def newdate = ''
/*
 * Set "select_files" to 'yes' if you want to use file selector
 * to specify files for export. If anything else is specified, the script
 * will work with the complete project.
 */ 
select_files = ''

import javax.swing.JFileChooser
import org.omegat.util.StaticUtils
import org.omegat.util.TMXReader
import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*
def prop = project.projectProperties

if (!prop) {
	final def title = 'Export new translation'
	final def msg   = 'Please try again after you open a project.'
	showMessageDialog null, msg, title, INFORMATION_MESSAGE
	return
}

/*
 * If you want to use an external date and time selector and a window to
 * ask whether you want to select individual files, specify the whole path
 * to that program/script. It should print out date in the proper format
 * on the first line of the stdout, and "yes" or anything else on the second
 * line. 
 */
def command = "/home/user/.omegat/script/new2tmx_tweak"
try {
	proc = command.execute()
	proc.waitFor()
	//console.println "${proc.in.text}"
	def lines = "${proc.in.text}".readLines()
	newdate = lines[0]
	select_files = lines[1]
	}
catch(java.io.IOException ex){
if (ex.getMessage() =~ 'error=13'){
	console.println "The program is not executable"
	}
if (ex.getMessage() =~ 'error=2'){
	console.println "The program is not found"
	}
}

try {
	newdate = new Date().parse("yyyy-MM-dd HH:mm", newdate)
	}
	catch (java.text.ParseException e) {
		newdate = new Date().minus(1)
		final def title = 'Wrong date format'
		final def msg   = """\
The date has been specified in a wrong format.
The script will work with entries exactly one day old,
i.e. changed after $newdate\
"""
		console.println msg
		showMessageDialog null, msg, title, INFORMATION_MESSAGE
		}

namedate = newdate.format("MMM-dd-yyyy_HH.mm")

def fileloc = prop.projectRoot+'translated_after_'+namedate+"${ (select_files == 'yes') ? "_select" : ''}"+'.tmx'
exportfile = new File(fileloc)

if (prop.isSentenceSegmentingEnabled())
	segmenting = TMXReader.SEG_SENTENCE
	else
	segmenting = TMXReader.SEG_PARAGRAPH

def sourceLocale = prop.getSourceLanguage().toString()
def targetLocale = prop.getTargetLanguage().toString()

exportfile.write("", 'UTF-8')
exportfile.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n", 'UTF-8')
exportfile.append("<!DOCTYPE tmx SYSTEM \"tmx14.dtd\">\n", 'UTF-8')
exportfile.append("<tmx version=\"1.4\">\n", 'UTF-8')
exportfile.append(" <header\n", 'UTF-8')
exportfile.append("  creationtool=\"OmegaTScripting\"\n", 'UTF-8')
exportfile.append("  segtype=\"" + segmenting + "\"\n", 'UTF-8')
exportfile.append("  o-tmf=\"OmegaT TMX\"\n", 'UTF-8')
exportfile.append("  adminlang=\"EN-US\"\n", 'UTF-8')
exportfile.append("  srclang=\"" + sourceLocale + "\"\n", 'UTF-8')
exportfile.append("  datatype=\"plaintext\"\n", 'UTF-8')
exportfile.append(" >\n", 'UTF-8')

def hitcount = 0

if ((select_files == 'yes')) {
	srcroot = new File(prop.getSourceRoot())
	sourceroot = prop.getSourceRoot().toString() as String
	JFileChooser fc = new JFileChooser(
	currentDirectory: srcroot,
	dialogTitle: "Choose files to export",
	fileSelectionMode: JFileChooser.FILES_ONLY, 
	//the file filter must show also directories, in order to be able to look into them
	multiSelectionEnabled: true)

	if(fc.showOpenDialog() != JFileChooser.APPROVE_OPTION) {
	console.println "Canceled"
	return
	}

	if (!(fc.selectedFiles =~ sourceroot.replaceAll(/\\+/, '\\\\\\\\'))) {
		console.println "Selection outside of ${prop.getSourceRoot()} folder"
		final def title = 'Wrong file(s) selected'
		final def msg   = "Files must be in ${prop.getSourceRoot()} folder."
		showMessageDialog null, msg, title, INFORMATION_MESSAGE
		return
	}

	fc.selectedFiles.each {
		fl = "${it.toString()}" - "$sourceroot"
		exportfile.append("  <prop type=\"Filename\">" + fl + "</prop>\n", 'UTF-8')
	}
	exportfile.append(" </header>\n", 'UTF-8')
	exportfile.append("  <body>\n", 'UTF-8')

	fc.selectedFiles.each{
		fl = "${it.toString()}" - "$sourceroot" 
		files = project.projectFiles
		files.each{
			if ( "${it.filePath}" != "$fl" ) {
			println "Skipping to the next file"
			}else{
		it.entries.each {
		def info = project.getTranslationInfo(it)
		def changeId = info.changer
		def changeDate = info.changeDate
		def creationId = info.creator
		def creationDate = info.creationDate
		def alt = 'unknown'
		if (info.isTranslated()) {
			if (newdate.before(new Date(changeDate))){
				hitcount++
				source = StaticUtils.makeValidXML(it.srcText)
				target = StaticUtils.makeValidXML(info.translation)
				exportfile.append("    <tu>\n", 'UTF-8')
				exportfile.append("      <tuv xml:lang=\"" + sourceLocale + "\">\n", 'UTF-8')
				exportfile.append("        <seg>" + "$source" + "</seg>\n", 'UTF-8')
				exportfile.append("      </tuv>\n", 'UTF-8')
				exportfile.append("      <tuv xml:lang=\"" + targetLocale + "\"", 'UTF-8')
				exportfile.append(" changeid=\"${changeId ?: alt }\"", 'UTF-8')
				exportfile.append(" changedate=\"${ changeDate > 0 ? new Date(changeDate).format("yyyyMMdd'T'HHmmss'Z'", TimeZone.getTimeZone('UTC')) : alt }\"", 'UTF-8')
				exportfile.append(" creationid=\"${creationId ?: alt }\"", 'UTF-8')
				exportfile.append(" creationdate=\"${ creationDate > 0 ? new Date(creationDate).format("yyyyMMdd'T'HHmmss'Z'", TimeZone.getTimeZone('UTC')) : alt }\"", 'UTF-8')
				exportfile.append(">\n", 'UTF-8')
				exportfile.append("        <seg>" + "$target" + "</seg>\n", 'UTF-8')
				exportfile.append("      </tuv>\n", 'UTF-8')
				exportfile.append("    </tu>\n", 'UTF-8')
						}
					}
				}
			}
		}
	}
} else {
	exportfile.append(" </header>\n", 'UTF-8')
	exportfile.append("  <body>\n", 'UTF-8')
	files = project.projectFiles
		files.each {
			it.entries.each {
			def info = project.getTranslationInfo(it)
			def changeId = info.changer
			def changeDate = info.changeDate
			def creationId = info.creator
			def creationDate = info.creationDate
			def alt = 'unknown'
			if (info.isTranslated()) {
				if (newdate.before(new Date(changeDate))){
				hitcount++
				source = StaticUtils.makeValidXML(it.srcText)
				target = StaticUtils.makeValidXML(info.translation)
				exportfile.append("    <tu>\n", 'UTF-8')
				exportfile.append("      <tuv xml:lang=\"" + sourceLocale + "\">\n", 'UTF-8')
				exportfile.append("        <seg>" + "$source" + "</seg>\n", 'UTF-8')
				exportfile.append("      </tuv>\n", 'UTF-8')
				exportfile.append("      <tuv xml:lang=\"" + targetLocale + "\"", 'UTF-8')
				exportfile.append(" changeid=\"${changeId ?: alt }\"", 'UTF-8')
				exportfile.append(" changedate=\"${ changeDate > 0 ? new Date(changeDate).format("yyyyMMdd'T'HHmmss'Z'", TimeZone.getTimeZone('UTC')) : alt }\"", 'UTF-8')
				exportfile.append(" creationid=\"${creationId ?: alt }\"", 'UTF-8')
				exportfile.append(" creationdate=\"${ creationDate > 0 ? new Date(creationDate).format("yyyyMMdd'T'HHmmss'Z'", TimeZone.getTimeZone('UTC')) : alt }\"", 'UTF-8')
				exportfile.append(">\n", 'UTF-8')
				exportfile.append("        <seg>" + "$target" + "</seg>\n", 'UTF-8')
				exportfile.append("      </tuv>\n", 'UTF-8')
				exportfile.append("    </tu>\n", 'UTF-8')
					}
				}
			}
		}
}

exportfile.append("  </body>\n", 'UTF-8')
exportfile.append("</tmx>", 'UTF-8')

final def title = 'TMX file written'
final def msg   = "$hitcount TU's written to " + exportfile.toString()
console.println msg
showMessageDialog null, msg, title, INFORMATION_MESSAGE
return

The only difference compared to the previous version is lines 43-66. The external program is specified in line 50.
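The contract between the groovy script and the external helper is tiny: the date goes on the first line of stdout, and “yes” (or anything else) on the second. Here is a small sketch of that contract in isolation, so it can be tried without OmegaT. The closure name readHelperOutput and the stricter format check are my own additions and not part of the script above, which simply takes the first two lines as they come.

```groovy
// Hypothetical helper: interpret the two-line output of the external program.
// Line 1: date as "yyyy-MM-dd HH:mm"; line 2: "yes" to select individual files.
def readHelperOutput = { String stdout ->
    def lines = (stdout ?: '').readLines()*.trim()
    def dateText = lines.size() > 0 ? lines[0] : ''
    // Only a shape check; an empty value lets the main script fall back
    // to its default of one day.
    def wellFormed = dateText ==~ /\d{4}-\d{2}-\d{2} \d{2}:\d{2}/
    [
        newdate     : wellFormed ? dateText : '',
        select_files: (lines.size() > 1 && lines[1] == 'yes') ? 'yes' : ''
    ]
}

println readHelperOutput("2013-08-12 14:30\nyes\n")
println readHelperOutput("dialog was cancelled")
```

Any program that prints those two lines will do, which is why the picker itself can be written in whatever is handy on a given OS.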

new2tmx_tweak

#!/bin/bash
DATE=$(zenity --calendar \
--title "Select date" \
--text="TU's newer than the selected date will be used for export" \
--date-format=%F)

HRS=({00..23})
MINS=({00..59})

HRS=$(zenity --entry --title "Select time" \
--text="Hours" \
--entry-text="${HRS[@]}" )

MINS=$(zenity --entry --title "Select time" \
--text="Minutes" \
--entry-text="${MINS[@]}" )

zenity --question --title="Select Files" \
--text="Do you want to select individual files for export?"
if [ $? == "0" ]; then 
FILESEL='yes'
else
FILESEL='no'
fi

echo "$DATE $HRS:$MINS"
echo $FILESEL

This script should be saved somewhere (in this example it’s /home/user/.omegat/script/new2tmx_tweak) and made executable (chmod +x /home/user/.omegat/script/new2tmx_tweak). Zenity must be installed for it to work. One can cook up a nicer GUI using other tools, of course, but this serves just as a quick example. I guess something similar can be done using AutoIt or AutoHotkey on MS Windows, or AppleScript on OS X.
If you happen to come up with your own date and time picker for this, feel free to share in comments or link to your solution.

But as of now,

Good luck!

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

August 10, 2013

Posted under

groovy, omegat, QA, TMX

Comments

3 Comments

Export TMX for new translations

Here’s a script that lets you export translation units that are newer than a specified date. It might be useful if you get a project with a legacy TMX containing some perfect matches (that memory should be put into /tm/auto), and you want to be able to run QA tests only on new TU’s that have been translated since you’ve started to work on the project, or you want to keep your own translations for later and don’t really care for what has been translated before.

If the script is invoked without changing or specifying anything, it looks for TU’s that are one day old and works globally on the whole project. If you need to filter TU’s older or newer than that, you’ll have to specify the date on line 21.
The date should be in “yyyy-MM-dd HH:mm” (4_digit_year-2_digit_month-2_digit_day space 2_digit_hour_in_military_notation:2_digit_minute) format. If not specified properly, it will fall back to the default one-day-old value.
Besides that, the user can specify whether the script should work globally on the entire project, or only on selected files. To enable file selection, line 27 should read:
select_files = 'yes'
Anything other than ‘yes’ means the script will work globally. Continue reading →
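The date handling and the “newer than” filter described above can be sketched without OmegaT at all. Below, plain maps stand in for OmegaT entries, and parseCutoff and newerThan are names I made up for this illustration; the real script reads the change date from the project instead.

```groovy
import java.text.ParseException
import java.text.SimpleDateFormat

// Parse "yyyy-MM-dd HH:mm"; on any problem fall back to one day before 'now'.
def parseCutoff = { String text, Date now ->
    def fmt = new SimpleDateFormat('yyyy-MM-dd HH:mm')
    fmt.lenient = false
    try {
        return fmt.parse(text ?: '')
    } catch (ParseException ignored) {
        return new Date(now.time - 24L * 60 * 60 * 1000)
    }
}

// Keep only translated entries changed after the cutoff.
// changeDate is in milliseconds since the epoch (an assumption of this sketch).
def newerThan = { List<Map> entries, Date cutoff ->
    entries.findAll { it.translation && it.changeDate > cutoff.time }
}

def now = new Date()
def entries = [
    [source: 'Hello', translation: 'Hallo', changeDate: now.time - 1000L],
    [source: 'World', translation: 'Welt',  changeDate: now.time - 3L * 24 * 60 * 60 * 1000],
    [source: 'Again', translation: null,    changeDate: now.time]
]
println newerThan(entries, parseCutoff('', now))*.source
```

With an empty or malformed date only the first entry survives: the second is older than one day and the third has no translation.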

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

July 15, 2013

Posted under

groovy, omegat

Comments

3 Comments

Write all source segments to a file

Update: Please, download scripts from the dedicated SF.net project page where they are maintained. Scripts at the links below might be obsolete (though most likely still working).

Since we started to make OmegaT write stuff to files, let’s try to dump all source segments to one file. I’m pretty sure one can find some use for it.

write_source2file.groovy

/*
 * #Purpose:	Write all source segments to a file
 * #Files:	Writes 'allsource.txt' in the 'script_output' subfolder of the current project
 * 
 * @author:	Kos Ivantsov
 * @date:	2013-07-16
 * @version:	0.2
 */

/* change "includefilenames" to anything but 'yes' (with quotes)
 * if you don't need filenames to be included in the file */

def includefilenames = 'no'
def includerepetitions = 'no'

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
	final def title = 'Source to File'
	final def msg   = 'Please try again after you open a project.'
	showMessageDialog null, msg, title, INFORMATION_MESSAGE
	return
}

def folder = prop.projectRoot+'/script_output'
def fileloc = folder+'/allsource.txt'
writefile = new File(fileloc)
if (! (new File(folder)).exists()) {
	(new File(folder)).mkdir()
	}

writefile.write("", 'UTF-8')
def count = 0
def uniqline

if (includefilenames == 'yes') {
	files = project.projectFiles;
	for (i in 0 ..< files.size())
	{
		fi = files[i];
		marker = "+${'='*fi.filePath.size()}+\n"
		writefile.append("$marker|$fi.filePath|\n$marker", 'UTF-8')
		for (j in 0 ..< fi.entries.size())
		{
		ste = fi.entries[j];
		source = ste.getSrcText();
		writefile.append source +"\n", 'UTF-8'
		count++;
		}
	}
} else {
	project.allEntries.each { ste ->
	source = ste.getSrcText();
	writefile.append source+"\n",'UTF-8'
	count++
		}
	console.println "$count segments found in all files"
	if (includerepetitions != 'yes') {
		count = 0
		uniqline = writefile.readLines().unique()
		//console.println uniqline
		writefile.write("",'UTF-8')
		uniqline.each {
		writefile.append "$it\n\n",'UTF8';
		count++
				}
			}
	}

console.println count +" segments written to "+ writefile
final def title = 'Source to File'
final def msg   = count +" segments"+"\n"+"written to \n"+ writefile
showMessageDialog null, msg, title, INFORMATION_MESSAGE
return

Once the script is invoked, it creates a file named “allsource.txt” in the script_output subfolder of the current project, where each segment is on a new line. It contains all the segments, even the ones that are already translated. The script can either write out each filename in a box like this
+====+
|file|
+====+
followed by all the segments that belong to this file, then the next filename and its segments, and so on, or just dump all the segments in the order they appear in OmegaT without indicating what files they belong to. This behavior is controlled by line 13. When it says def includefilenames = 'yes', you’ll get filenames written to allsource.txt, but if you don’t want the filenames, change ‘yes’ to anything else or even leave it empty, making sure you have quotes, i.e. it can say def includefilenames = 'no, thanks' or even def includefilenames = '', but not def includefilenames = no (no quotes in the last example).
The way the filenames get marked is defined in lines 44, 45.
If filenames are not included, one can choose whether to include repetitions (line 14). 'yes' means “yes”, anything else, even 'yep', means “no”.
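For those who’d like to play with the two switches outside OmegaT, here is a stripped-down sketch of the same logic working on a plain map of filenames to segments. The names fileBox and dump are invented for the example, and unlike the script above it returns a string instead of writing a file and doesn’t put an empty line between unique segments.

```groovy
// Draw the filename box: a line of '=' as long as the path, framed by '+' and '|'.
def fileBox = { String filePath ->
    def marker = "+${'=' * filePath.size()}+\n"
    "${marker}|${filePath}|\n${marker}".toString()
}

// files: filename -> list of source segments, in project order.
def dump = { Map<String, List<String>> files, boolean includeFilenames, boolean includeRepetitions ->
    def out = new StringBuilder()
    if (includeFilenames) {
        files.each { path, segments ->
            out << fileBox(path)
            segments.each { out << it << '\n' }
        }
    } else {
        def all = files.values().flatten()
        // unique(false) returns a new list and leaves 'all' untouched
        (includeRepetitions ? all : all.unique(false)).each { out << it << '\n' }
    }
    out.toString()
}

def files = ['a.txt': ['Hello', 'World'], 'b.txt': ['Hello']]
print dump(files, true, true)
```

Note that, just like in the script, repetitions are only filtered when filenames are not included.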

Suggestions, enhancements, bug reports, donations, postcards, invitations to a cup of coffee, feature requests, interesting translation projects with a good pay etc. are always welcome. Criticism isn’t, but will be accepted too.

But as of now,
Good luck

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

July 13, 2013

Posted under

groovy, omegat, SVN, Team Project

SVN status

Here’s a little script that checks current SVN status of various files in an OmegaT team project. May not be awfully useful, but sometimes it can help you prevent or solve SVN sync issues. It doesn’t do anything special, just shows you the status of the project’s project_save.tmx, main writable glossary, current file and the whole project folder. Continue reading →

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

June 26, 2013

Posted under

groovy, omegat, tags

Stripping Tags Everywhere, Groovy Way

Every once in a while you have to deal with a match that has wrong tags. Hopefully, pretty soon OmegaT will be smart enough to deal with such matches for you, making it possible to insert a wrongly tagged match in such a way that you wouldn’t have to fix tags — they’ll get fixed on their own. But while we are not there yet, a practical workaround is to use the match tag-free and to insert proper tags wherever needed (OmegaT 3 lets you insert them one by one, and a new default shortcut for that is Ctrl+T).
In this post I share 5 groovy scripts to strip tags in different situations (headings link to pastebin.com, files can be downloaded from there):

Replacing target with match

/*
 * #Purpose: Replace current target with tag-free match 
 * #Details: http://wp.me/p3fHEs-4W
 * 
 * @author   Kos Ivantsov
 * @date     2013-06-26
 * @version  0.1
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*
import org.omegat.core.Core;

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Replace with Match (no tags)'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

def match = Core.getMatcher()
def near = match.getActiveMatch()
if (near != null) {
  def matchtranslation = "$near.translation"
  matchtranslation = matchtranslation.replaceAll(/<\/?[a-z]+[0-9]* ?\/?>/, '')
  editor.replaceEditText(matchtranslation);
}

Inserting match

/*
 * #Purpose: Insert tag-free match into current target 
 * #Details: http://wp.me/p3fHEs-4W
 * 
 * @author   Kos Ivantsov
 * @date     2013-06-26
 * @version  0.1
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*
import org.omegat.core.Core;

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Insert Match (no tags)'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

def match = Core.getMatcher()
def near = match.getActiveMatch()
if (near != null) {
  def matchtranslation = "$near.translation"
  matchtranslation = matchtranslation.replaceAll(/<\/?[a-z]+[0-9]* ?\/?>/, '')
  editor.insertText(matchtranslation)
}

Replacing target with source

/*
 * #Purpose: Replace current target with tag-free source 
 * #Details: http://wp.me/p3fHEs-4W
 * 
 * @author   Kos Ivantsov
 * @date     2013-06-26
 * @version  0.1
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Replace with Source (no tags)'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

def stext = editor.currentEntry.getSrcText().replaceAll(/<\/?[a-z]+[0-9]* ?\/?>/, '')
editor.replaceEditText(stext)

Inserting source

/*
 * #Purpose: Insert tag-free source into current target 
 * #Details: http://wp.me/p3fHEs-4W
 * 
 * @author   Kos Ivantsov
 * @date     2013-06-26
 * @version  0.1
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Insert source (no tags)'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

def stext = editor.currentEntry.getSrcText().replaceAll(/<\/?[a-z]+[0-9]* ?\/?>/, '')
editor.insertText(stext)

Stripping tags in target

/*
 * #Purpose: Remove tags in the current target 
 * #Details: http://wp.me/p3fHEs-4W
 * 
 * @author   Kos Ivantsov
 * @date     2013-06-26
 * @version  0.1
 */
import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Strip tags in current segment'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

target = editor.getCurrentTranslation()
if (target != null) {
target = target.replaceAll(/<\/?[a-z]+[0-9]* ?\/?>/, '')
}
editor.replaceEditText(target)

There are plenty of other ways to remove tags in OmegaT, some of them even posted as my recipes, but the beauty of using groovy is that scripts can be run from within OmegaT, with its own keyboard shortcut, without needing to assign an OS shortcut to an external script/application.
As usual, inspiration for the scripts was an idea shared at the OmegaT Yahoo! Group.
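All five scripts rely on the same regular expression, so it may be worth seeing what it does and doesn’t catch. The snippet below runs in any Groovy console without OmegaT; the closure name stripTags and the sample strings are mine.

```groovy
// The same pattern the scripts use: an optional slash, lowercase letters,
// optional digits, an optional space and an optional closing slash.
def TAG = /<\/?[a-z]+[0-9]* ?\/?>/
def stripTags = { String text -> text.replaceAll(TAG, '') }

// Paired and standalone OmegaT tags are removed.
println stripTags('Press <f0>OK</f0> to continue<br1/>')
// Comparison signs surrounded by spaces are left alone.
println stripTags('if a < b then c > d')
// Lowercase HTML-like tags without digits match as well.
println stripTags('keep <em>this</em>')
```

Since the digits are optional in the pattern, anything that looks like a lowercase tag goes, so literal text such as <em> in a segment would be stripped too.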

Good luck!

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

June 23, 2013

Posted under

groovy, omegat

Writing Auxiliary Text Files from OmegaT

Update: Please, download scripts from the dedicated SF.net project page where they are maintained. Scripts at the links below might be obsolete (though most likely still working).

Here I’d like to share two Groovy scripts that don’t help with anything at hand in OmegaT, but write out external text files that can often be helpful in producing better quality translation.

The first script writes selected text to a file along with some context information. This can be helpful if you need to produce a list of unknown/unclear terms that need to be discussed with the client, or things to be double-checked, studied, rewritten etc.

write_selection2list.groovy

/*
 * #Purpose: Write selection to a file to create a list of terms
 * #Files:   Writes 'terms_list.txt' in the current project's root
 *     the file contains selection text, segment number, segment text
 *     and filename of the selection, if selection is in the current segment,
 *     or just the text of selection and the filename, if selection
 *     is outside the current segment.
 * #Note:    When invoked without selection, it opens the file
 *     in the default text editor
 * #Details: http://wp.me/p3fHEs-4L
 *
 * @author   Kos Ivantsov
 * @based on scripts by Yu Tang
 * @date     2013-06-25
 * @version  0.2
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Selection to List'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}
// get segment #, source filename and the whole current segment
def srcfile = editor.currentFile
def ste = editor.currentEntry
cur_text = ste.getSrcText()
cur_seg = ste.entryNum()

// define list file

def folder = prop.projectRoot
def fileloc = folder+'/terms_list.txt'
list_file = new File(fileloc)

// create file if it doesn't exist
if (! list_file.exists()) {
	list_file.write("",'UTF-8')
	}

/* 
 * command to open the file if there's no active selection
 * if a custom (not OS default) text editor should be used,
 * it needs to be defined in the next line (edit as needed and uncomment)
 */

// def textEditor = /path to your editor/
def command
switch (osType) {
  case [OsType.WIN64, OsType.WIN32]:
    command = "cmd /c start \"\" \"$list_file\""  // default
    try { command = textEditor instanceof List ? [*textEditor, list_file] : "\"$textEditor\" \"$list_file\"" } catch (ignore) {}
    break
  case [OsType.MAC64, OsType.MAC32]:
    command = ['open', list_file]  // default
    try { command = textEditor instanceof List ? [*textEditor, list_file] : ['open', '-a', textEditor, list_file] } catch (ignore) {}
    break
  default:  // for Linux or others
    command = ['xdg-open', list_file] // default
    try { command = textEditor instanceof List ? [*textEditor, list_file] : [textEditor, list_file] } catch (ignore) {}
    break
}

def sel_txt = editor.selectedText
if (sel_txt) {
	list_file.append "${'='*10}\n $sel_txt\n",'UTF-8'
	// plain substring test, so regex metacharacters in the selection are harmless
	if (cur_text.contains(sel_txt)) {
		list_file.append "${'-'*5}\n\
filename: $srcfile\n\
segment: $cur_seg\n\
segment text: $cur_text \n\n",'UTF-8'
	}else{
		list_file.append "${'-'*5}\n\
filename: $srcfile\n\
***Selection outside of current segment***\n",'UTF-8'
	}
	console.println "\"$sel_txt\" written to $list_file"
} else {
console.println "[No selection]"
console.println "***Opening the file in text editor***"
console.println "Command: $command"
command.execute()
return // exit
}

The list is created in the current OmegaT project folder, file is named terms_list.txt. When the script is invoked with no selection, this file is opened in the default text editor — so that you can easily view or edit the file. When it’s invoked with some text selected in the Editor pane, the selection gets written to the file along with some context info depending on whether selection is inside or outside of the current segment.
I’d like to write wider context, but I don’t know how to get text from previous and next segment without actually going there. Any help is welcome and appreciated, as usual.
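One possible way to get the neighbours without moving around in the Editor is to look them up by position in the list of all segments. The sketch below works on a plain list of strings; in OmegaT one might feed it the source texts from project.allEntries together with the current entryNum(), but that assumes entryNum() is a 1-based position in that list, which I haven’t verified, so treat it as an idea rather than a recipe. The name contextAround is made up.

```groovy
// segments: all source texts in project order; entryNum: 1-based segment number.
def contextAround = { List<String> segments, int entryNum ->
    int idx = entryNum - 1
    [
        previous: idx > 0 ? segments[idx - 1] : null,
        current : segments[idx],
        next    : idx < segments.size() - 1 ? segments[idx + 1] : null
    ]
}

def segments = ['First sentence.', 'Second sentence.', 'Third sentence.']
println contextAround(segments, 2)
```

The neighbours found this way may belong to a different file than the current segment, so a real script would probably want to check that too.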

The second script writes unique untranslated segments from the complete project into a text file named untranslated.txt. This file is located in the project’s root folder, and is rewritten each time the script is invoked. Such a file can be used for a number of purposes, including producing TMX with MT.

write_untranslated2file.groovy

/*
 * #Purpose: Write all unique untranslated segments to a file
 * #Files:   Writes 'untranslated.txt' in the current project's root
 * #Details: http://wp.me/p3fHEs-4L
 *
 * @author   Kos Ivantsov
 * @based on scripts by Yu Tang
 * @date     2013-06-25
 * @version  0.2
 */

import static javax.swing.JOptionPane.*
import static org.omegat.util.Platform.*

// abort if a project is not opened yet
def prop = project.projectProperties
if (!prop) {
  final def title = 'Untranslated to File'
  final def msg   = 'Please try again after you open a project.'
  showMessageDialog null, msg, title, INFORMATION_MESSAGE
  return
}

def folder = prop.projectRoot
def fileloc = folder+'/untranslated.txt'
writefile = new File(fileloc)

writefile.write("", 'UTF-8')
def count = 0
project.projectFiles
.each {
//console.println "\n${it.filePath}"
it.entries
.findAll {!project.getTranslationInfo(it).isTranslated()}
.each {count++; writefile.append "${it.srcText}\n",'UTF-8'}
}

console.println "\nUntranslated segments found: $count"
count = 0 
def lines = writefile.readLines()
uniqline = lines.unique()
writefile.write("",'UTF-8')
uniqline.each {
writefile.append "$it\n",'UTF8';
}
console.println "Unique untranslated segments written to file:  ${uniqline.size()}"

If you have ideas how to improve these, feel free to share.

UPDATE:

Here’s another script that writes all source segments to a file

But as of now,
Good luck!

Posted by

Kos Ivantsov (VerdaKáfo)

Posted on

June 20, 2013

Posted under

groovy, omegat

Substitute Template For Each Project

Update: Please, download scripts from the dedicated SF.net project page where they are maintained. Scripts at the links below might be obsolete (though most likely still working).

Here I have a script that reads a tab-separated file (any number of tabs between items), each line of which contains the pattern to be found in the first position, and what it should be replaced with in the second. This file MUST be named subst_template.txt (well, it can be changed in the script, so maybe such a loud “must” isn’t really needed). The first pair should start on the first line, with no empty lines between the pairs, and after the final pair there should be exactly one empty line. Below you’ll find an example of such a file.
The file ought to be placed in the OmegaT project’s root. That is done intentionally so that one can have a unique set of substitute patterns for each project. For example, I had an English to Ukrainian Christian project where names of the Bible books needed to be translated using one particular Ukrainian Bible version (Khomenko Bible), while for another project they needed to be taken from another version (Ohiyenko Bible). While English abbreviations remained the same, the Ukrainian ones needed to be quite different (for instance, “Jn.” was “Йо.” in one, and “Ів.” in the other). So having a separate substitute pattern file in each project, I could use just one script to get Bible references with proper abbreviations in each of them. Continue reading →
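To give a rough idea of how such a template can be consumed, here is a small sketch that parses tab-separated pairs from a string and applies them to a segment. It is not the script from the full post: the names parseTemplate and applyTemplate are invented, and I’m assuming the first column is treated as a regular expression (hence the escaped dot in “Jn\.”).

```groovy
// Turn the template text into a list of [pattern, replacement] rules.
// Any number of tabs may separate the two items; blank lines are skipped.
def parseTemplate = { String text ->
    text.readLines()
        .findAll { it.trim() }
        .collect { it.split(/\t+/, 2) }
        .findAll { it.size() == 2 }
        .collect { [pattern: it[0], replacement: it[1]] }
}

// Apply every rule in order to the given segment.
def applyTemplate = { String segment, List<Map> rules ->
    rules.inject(segment) { acc, rule -> acc.replaceAll(rule.pattern, rule.replacement) }
}

// Two projects, two templates, one script.
def khomenko = parseTemplate('Jn\\.\t\tЙо.\n')
def ohiyenko = parseTemplate('Jn\\.\tІв.\n')
println applyTemplate('Jn. 3:16', khomenko)
println applyTemplate('Jn. 3:16', ohiyenko)
```

In a real project the text would come from subst_template.txt in the project’s root rather than from a string literal.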

True Translation

A daily life between languages

Tag Archives: script
