Galician Technical Corpus (CTG)
Search help
Complex searches use regular expressions that follow the syntax and semantics defined by POSIX 1003.2. Some examples:
- byte - a word containing the string "byte" in any position, like "xigabyte", "megabyte", "bytes", "Kbytes", "terabytes"...
- [[:<:]]mega[[:alpha:]]* - a word that begins with "mega" ("[[:<:]]" marks the start of a word and "[[:alpha:]]" matches any letter), such as "mega", "megas", "megabytes", "megabit", "megalómano"...
- produc{1,2}ión[[:>:]] - a word ending in "produción" or "producción" ("[[:>:]]" marks the end of a word)
- [[:<:]]xigabytes?[[:>:]] - the word "xigabyte" or the word "xigabytes"
- [[:<:]][xg]igabytes[[:>:]] - the word "xigabytes" or the word "gigabytes"
- [[:<:]][xg]igab[yi]tes?[[:>:]] - any of the following words: "xigabytes", "gigabytes", "xigabyte", "gigabyte", "xigabites", "gigabites", "xigabite", "gigabite"
- [[:<:]]a[[:alpha:]]a[[:>:]] - a three-letter word that begins and ends with "a"
- [[:<:]]a[[:alpha:]]{2}a[[:>:]] - a four-letter word that begins and ends with "a"
- [[:<:]]a[[:alpha:]]*a[[:>:]] - a word of two or more letters that begins and ends with "a"
- [[:<:]]a[[:alpha:]]+a[[:>:]] - a word of three or more letters that begins and ends with "a"
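To try the example patterns above outside the corpus, the following is a minimal Python sketch. Python's `re` module is a different engine and does not accept "[[:<:]]", "[[:>:]]" or "[[:alpha:]]", so the sketch rewrites them first. The translation table and the helper names `to_python` and `find_words` are assumptions made for illustration; they are not part of the CTG search engine.

```python
import re

# Assumed translations from the POSIX notation used on this page to Python's
# `re` syntax; they are approximations for illustration only.
POSIX_TO_PYTHON = {
    "[[:<:]]": r"\b(?=\w)",      # start of a word
    "[[:>:]]": r"\b(?<=\w)",     # end of a word
    "[[:alpha:]]": r"[^\W\d_]",  # any letter
}


def to_python(posix_pattern):
    """Rewrite the POSIX-only tokens so Python's `re` can compile the pattern."""
    for token, replacement in POSIX_TO_PYTHON.items():
        posix_pattern = posix_pattern.replace(token, replacement)
    return posix_pattern


def find_words(posix_pattern, text):
    """Return every substring of `text` matched by the translated pattern."""
    return [m.group(0) for m in re.finditer(to_python(posix_pattern), text)]


sample = "un xigabyte, dous gigabites e tres megabytes"
print(find_words("[[:<:]][xg]igab[yi]tes?[[:>:]]", sample))  # ['xigabyte', 'gigabites']
print(find_words("[[:<:]]mega[[:alpha:]]*", sample))         # ['megabytes']
print(find_words("produc{1,2}ión[[:>:]]", "produción, producción"))
# ['produción', 'producción']
```

The quantifiers, bracket lists and alternation are passed through unchanged, because Python writes them the same way.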
Symbols for characters
- . - any single character, including a blank
- [[:alpha:]] - any letter
- [[:alnum:]] - any letter or digit
- [[:digit:]] - any digit
- [[:space:]] - any whitespace character, such as space, tab, newline, carriage return...
- [[:<:]], [[:>:]] - word boundaries: "[[:<:]]" matches at the start of a word and "[[:>:]]" at the end
- [abc] - "a" or "b" or "c"
- [^abc] - any character other than "a", "b" or "c"
- [0-9] - any digit; equivalent to [0123456789]
- [a-z] - any letter between "a" and "z", that is, "a", or "b", or "c", or "d"...
- (abc|xyz) - "abc" or "xyz"
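The bracket lists, the dot and the alternation above are written the same way in Python's `re` module, so they can be tried with a short sketch like the one below. The helper name `matches` is hypothetical, and Python is used here only as a stand-in for the corpus engine.

```python
import re


def matches(pattern, text):
    """Return the full text of every match of `pattern` in `text`."""
    return [m.group(0) for m in re.finditer(pattern, text)]


print(matches(r".", "a b"))                  # ['a', ' ', 'b']
print(matches(r"[abc]", "cabo"))             # ['c', 'a', 'b']
print(matches(r"[^abc]", "cabo"))            # ['o']
print(matches(r"[0-9]", "x86-64"))           # ['8', '6', '6', '4']
print(matches(r"[a-z]", "Kb"))               # ['b']
print(matches(r"(abc|xyz)", "abc-xyz-abd"))  # ['abc', 'xyz']
```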
Quantifiers
- x+ (1 or more occurrences of "x", with no upper limit, that is, "x", "xx", "xxx"...)
- x? (0 or 1 occurrences of "x", that is, "" or "x")
- x* (0 or more occurrences of "x", with no upper limit, that is, "", "x", "xx", "xxx"...)
- x{n} (exactly n occurrences of "x")
- x{m,n} (between m and n occurrences of "x", inclusive; no space is allowed inside the braces)
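The quantifiers can be checked against a fixed list of candidate strings. This is a minimal sketch in Python, whose `re` module writes these quantifiers the same way; the helper name `accepted` is hypothetical.

```python
import re


def accepted(pattern, candidates):
    """Return the candidates that the pattern matches in full."""
    return [c for c in candidates if re.fullmatch(pattern, c)]


candidates = ["", "x", "xx", "xxx"]
print(accepted(r"x+", candidates))      # ['x', 'xx', 'xxx']
print(accepted(r"x?", candidates))      # ['', 'x']
print(accepted(r"x*", candidates))      # ['', 'x', 'xx', 'xxx']
print(accepted(r"x{2}", candidates))    # ['xx']
print(accepted(r"x{1,2}", candidates))  # ['x', 'xx']
```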
Escaping characters
- \+ (literally "+")
- \* (literally "*")
- \. (literally ".")
- \? (literally "?")
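The effect of escaping can be seen by comparing an escaped and an unescaped pattern. The sketch below uses Python's `re` module, which writes these four escapes the same way; the sample text is invented for illustration.

```python
import re

text = "C++ and C+, versions 1.5 and 1x5"

# Unescaped, "." matches any character, so "1x5" is found as well.
print(re.findall(r"1.5", text))     # ['1.5', '1x5']

# Escaped, "\." matches only a literal dot.
print(re.findall(r"1\.5", text))    # ['1.5']

# Each "+" must be escaped to search for the literal string "C++".
print(re.findall(r"C\+\+", text))   # ['C++']

# re.escape adds the backslashes automatically for a literal search term.
print(re.escape("C++"))             # C\+\+
```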
Computational Linguistics Group (SLI) / TALG group, University of Vigo, 2006-2010
Web design and programming: Xavier Gómez Guinovart