Commit

#11, #12: ontologia_regulam; BCP47 style extension
fititnt committed Dec 1, 2021
1 parent 77e669d commit b2850c0
Showing 3 changed files with 53 additions and 17 deletions.
41 changes: 35 additions & 6 deletions docs/eng-Latn/hxltm.adoc
@@ -52,6 +52,12 @@ TIP: If you perceive too many literal translations,

#TODO: define a suggested language attribute for such reviewed translations (2021-11-29T21:49:00Z)#


////
- On the issue with gender on the Core-Person-Vocabulary
** https://github.com/SEMICeu/Core-Person-Vocabulary/issues/13
////

// TIP: One common symptom of literal translations is lack of context.

// Not only this, but it often means target languages must coin new terms from source terms so generic that they are unusable for serious work.
@@ -248,24 +254,47 @@ _TODO: this is a draft. Needs to be documented later_

=== `+__linguam__+` (implicitum)

==== `+de_linguam`
==== `+ix_de_linguam`
The language code of this column is stored as the value of an equivalent column with the name `+est_linguam`.

==== `+de_linguam_fontem`
==== `+ix_de_linguam_fontem`
The language code of this column is stored as the value of an equivalent column with the name `+est_linguam_fontem`.

==== `+de_linguam_objectivum`
==== `+ix_de_linguam_objectivum`
The language code of this column is stored as the value of an equivalent column with the name `+est_linguam_objectivum`.

==== `+est_linguam`
==== `+ix_est_linguam`
The values of each row on this column represent the code referenced on another column with attribute `+de_linguam`.

==== `+est_linguam_fontem`
==== `+ix_est_linguam_fontem`
The values of each row on this column represent the code referenced on another column with attribute `+de_linguam_fontem`.

==== `+est_linguam_objectivum`
==== `+ix_est_linguam_objectivum`
The values of each row on this column represent the code referenced on another column with attribute `+de_linguam_objectivum`.
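
The pairing described above, where a column tagged `+ix_de_linguam` takes its language code row by row from a column tagged `+ix_est_linguam`, could be sketched in Python roughly as follows. The hashtags, the sample rows, and the helper names are hypothetical and only serve the illustration; this is not the HXLTM implementation.

```python
from typing import List, Set, Tuple


def attributes(hashtag: str) -> Set[str]:
    """Return the attributes of an HXL hashtag, without the leading '+'."""
    return set(hashtag.split("+")[1:])


def pair_text_with_language(
    header: List[str], rows: List[List[str]]
) -> List[Tuple[str, str]]:
    """Return one (language_code, text) pair per row.

    Assumption: exactly one column carries +ix_de_linguam (the text) and
    exactly one carries +ix_est_linguam (the language code of that text).
    """
    de_idx = next(i for i, h in enumerate(header) if "ix_de_linguam" in attributes(h))
    est_idx = next(i for i, h in enumerate(header) if "ix_est_linguam" in attributes(h))
    return [(row[est_idx], row[de_idx]) for row in rows]


# Hypothetical table: the language of the third column varies per row
# and is read from the second column.
header = [
    "#item+conceptum+codicem",
    "#meta+linguam+ix_est_linguam",
    "#item+terminum+ix_de_linguam",
]
rows = [
    ["C1", "por-Latn", "exemplo"],
    ["C2", "eng-Latn", "example"],
]
pairs = pair_text_with_language(header, rows)
print(pairs)
```

The `_fontem` and `_objectivum` variants would pair up the same way, each `+ix_de_*` column with its `+ix_est_*` counterpart.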

=== `+ib_*`
* BCP47
* https://tools.ietf.org/rfc/bcp/bcp47


=== `+ib_h_*`
* BCP 47 Extension H - Use on HXLTM
* https://hxltm.etica.ai/

=== `+ib_t_*`
* BCP 47 Extension T - Transformed Content
* https://datatracker.ietf.org/doc/html/rfc6497


=== `+ib_u_*`
* Unicode Extensions for BCP 47
* https://cldr.unicode.org/index/bcp47-extension


=== `+ib_x_*`
* BCP47 Private Use Subtags
* https://www.rfc-editor.org/rfc/rfc4646#section-2.2.7
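
The `+ib_*` families above follow the BCP 47 extension singletons (`t`, `u`, `x`, plus the HXLTM-specific `h`). How an attribute maps onto a tag is not spelled out in this section, so the Python sketch below only shows the BCP 47 side: splitting a tag into its base subtags and its singleton-introduced extensions. The function name and the sample tag are assumptions for illustration, and the sketch skips validation and grandfathered tags.

```python
from typing import Dict, List, Optional, Tuple


def split_extensions(tag: str) -> Tuple[List[str], Dict[str, List[str]]]:
    """Split a BCP 47 tag into base subtags and extension sequences."""
    base: List[str] = []
    extensions: Dict[str, List[str]] = {}
    current: Optional[str] = None  # singleton of the extension being read

    for subtag in tag.lower().split("-"):
        if current == "x":
            # Everything after the private-use singleton belongs to it.
            extensions["x"].append(subtag)
        elif len(subtag) == 1:
            # A one-character subtag opens a new extension sequence.
            current = subtag
            extensions[current] = []
        elif current is None:
            base.append(subtag)
        else:
            extensions[current].append(subtag)
    return base, extensions


# Sample tag: Portuguese in Latin script, transformed from English (-t-),
# with a Unicode numbering-system keyword (-u-) and a private-use subtag (-x-).
base, extensions = split_extensions("pt-Latn-t-en-u-nu-latn-x-hxltm")
print(base)
print(extensions)
```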

==== Base tags used when HXLTM on XML-like container

NOTE: this section does not include other formalized specifications
9 changes: 7 additions & 2 deletions docs/ontologia-regulam.html
@@ -79,14 +79,19 @@
// var regula2 = '(?<linguam>(i.?_).*(?<etc>(\\+.?))).?'
// var regula2 = '(?<linguam>(\\+i.?_).*(?<etc>(.?))).?'
// var regula2 = '(?<linguam>(\\+i.?_).*(?<etc>[^\\+i](.?))).?'
var regula2 = '(?<linguam>(\\+i.?_).*(?<etc>[^\\+i.?](.?))).?'
// var regula2 = '(?<linguam>(\\+i.?_).*(?<etc>[^\\+i.?](.?))).?'
// var regula2 = '(?<linguam>(\\+i.?_.*))?(?<etcetera>[\\+[^i]].*)?'
// var regula3 = '(?<etcetera>(\\+[^i\w?_].*))'
var regula3 = '(?<linguam>(\\+i.?_.*))?(?<etcetera>(\\+[^i\w?_].*))'
var subspeciem = [
'+i_pt+i_por+ig_port1283+is_latn+rem',
'+i_pt+i_por+ig_port1283+is_latn+exemplo1+exemplo2+exemplo3',
'+ix_de_linguam',
'+ix_est_linguam',
'+ix_est_linguam_fontem',
]
var regula_regex2 = new RegExp(regula2, "i")
// var regula_regex2 = new RegExp(regula2, "i")
var regula_regex2 = new RegExp(regula3, "i")

subspeciem.forEach(element => {
console.log(regula_regex2.exec(element))
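
To see what `regula3` captures for each `subspeciem` sample, the test loop above can be approximated in Python. Two assumptions go into this port: in an ordinary JavaScript string literal `'\\+'` reaches the regex engine as `\+`, while the lone `'\w'` is read as a plain `w`, so the character class is effectively `[^iw?_]`; and `RegExp.exec` without the global flag behaves like `re.search`. Named groups are respelled as `(?P<name>...)`, and the helper name is hypothetical.

```python
import re

# Effective pattern of regula3 after JavaScript string unescaping
# (assumption: '\w' inside the quoted string collapses to a literal 'w').
REGULA3 = re.compile(
    r"(?P<linguam>(\+i.?_.*))?(?P<etcetera>(\+[^iw?_].*))",
    re.IGNORECASE,
)

SUBSPECIEM = [
    "+i_pt+i_por+ig_port1283+is_latn+rem",
    "+i_pt+i_por+ig_port1283+is_latn+exemplo1+exemplo2+exemplo3",
    "+ix_de_linguam",
    "+ix_est_linguam",
    "+ix_est_linguam_fontem",
]


def captured_groups(attrs: str):
    """Return (linguam, etcetera) for the first match, or None without a match."""
    match = REGULA3.search(attrs)
    if match is None:
        return None
    return match.group("linguam"), match.group("etcetera")


results = {attrs: captured_groups(attrs) for attrs in SUBSPECIEM}
for attrs, groups in results.items():
    print(attrs, "->", groups)
```

Under these assumptions the greedy `.*` in `linguam` leaves only the last non-language attribute for `etcetera` (`+rem`, `+exemplo3`), and the three `+ix_*` samples produce no match because `etcetera` is mandatory.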
20 changes: 11 additions & 9 deletions ontologia/cor.hxltm.215.yml
@@ -3008,27 +3008,29 @@ ontologia_regulam:
exemplum:
hxl_caput:
- hxl: '#item+conceptum+codicem'
divisionem: item
classem: conceptum
speciem: +codicem
divisionem: '#item'
classem: '+conceptum'
speciem: '+codicem'
- hxl: '#meta+linguam+i_pt+i_por+ig_port1283+is_latn'
divisionem: meta
classem: linguam
divisionem: '#meta'
classem: '+linguam'
speciem: +i_pt+i_por+ig_port1283+is_latn

# Trivia: strūctūram, https://en.wiktionary.org/wiki/structura#Latin
structuram:
# basim -> divisionem, classem, speciem
basim:
# https://regex101.com/r/XUOncM/5
javascript: >-
\#(?<divisionem>(item|meta)).+?(?<classem>(conceptum|linguam|terminum))(?<speciem>.*)
(?<divisionem>(#item|#meta))(?<classem>(\+conceptum|\+linguam|\+terminum))((?<linguam_de>(\+ix_de_[a-z_]*))|(?<linguam_est>(\+ix_est_[a-z_]*))|(?<linguam_i2a>(\+i_\w\w))?(?<linguam_i3a>(\+i_\w\w\w))(?<linguam_ig>(\+ig_\w\w\w\w\d\d\d\d))?((?<linguam_s4a>(\+is_\w{3,4})))(?<linguam_it>(\+it_[a-z0-9_]*))?)?(?<etcetera>(\+.*))?(?<datum_vocabularium>(\+v_[a-z_]*))?
# \#(?<divisionem>(item|meta)).+?(?<classem>(conceptum|linguam|terminum))(?<speciem>.*)
python: >-
\#(?P<divisionem>(item|meta)).+?(?P<classem>(conceptum|linguam|terminum))(?P<speciem>.*)
(?P<divisionem>(#item|#meta)).+?(?P<classem>(conceptum|linguam|terminum))(?P<speciem>.*)
subspeciem:
javascript: >-
\#(?<divisionem>(item|meta)).+?(?<classem>(conceptum|linguam|terminum))(?<speciem>.*)
(?<divisionem>(#item|#meta)).+?(?<classem>(conceptum|linguam|terminum))(?<speciem>.*)
python: >-
\#(?P<divisionem>(item|meta)).+?(?P<classem>(conceptum|linguam|terminum))(?P<speciem>.*)
(?P<divisionem>(#item|#meta)).+?(?P<classem>(conceptum|linguam|terminum))(?P<speciem>.*)
# named group:
# (?P<hxltag>\#[a-zA-Z_]*)(?P<hxlattrs>\+\w*){0,20}
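
As a rough check of the new `basim` JavaScript pattern, the sketch below ports it to Python by respelling each `(?<name>...)` group as `(?P<name>...)` and applies it to the two `hxl_caput` examples from this hunk. The port, the `parse` helper, and the choice of `re.match` are assumptions for illustration, not code from the repository.

```python
import re

# Python spelling of the basim pattern, split across lines for readability.
BASIM = re.compile(
    r"(?P<divisionem>(#item|#meta))"
    r"(?P<classem>(\+conceptum|\+linguam|\+terminum))"
    r"((?P<linguam_de>(\+ix_de_[a-z_]*))"
    r"|(?P<linguam_est>(\+ix_est_[a-z_]*))"
    r"|(?P<linguam_i2a>(\+i_\w\w))?"
    r"(?P<linguam_i3a>(\+i_\w\w\w))"
    r"(?P<linguam_ig>(\+ig_\w\w\w\w\d\d\d\d))?"
    r"((?P<linguam_s4a>(\+is_\w{3,4})))"
    r"(?P<linguam_it>(\+it_[a-z0-9_]*))?)?"
    r"(?P<etcetera>(\+.*))?"
    r"(?P<datum_vocabularium>(\+v_[a-z_]*))?"
)


def parse(hashtag: str) -> dict:
    """Return the named groups that matched, dropping the empty ones."""
    match = BASIM.match(hashtag)
    if match is None:
        return {}
    return {name: value for name, value in match.groupdict().items() if value is not None}


codicem = parse("#item+conceptum+codicem")
linguam = parse("#meta+linguam+i_pt+i_por+ig_port1283+is_latn")
print(codicem)
print(linguam)
```

One consequence of the group order: `etcetera` is greedy and comes before `datum_vocabularium`, so a trailing `+v_...` attribute appears to end up inside `etcetera` rather than in its own group.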
