Author Topic: [10000] Rip the complete English, Japanese, and French scripts of CT DS  (Read 36280 times)

utunnels

Re: [10000] Rip the complete English, Japanese, and French scripts of CT DS
« Reply #15 on: February 03, 2011, 10:19:44 pm »
Actually the font files are /msg/big/msgcmn.fnt and /msg/small/msgcmn.fnt, if you have a tool like Tahaxan that can explore NDS files.
 :wink:
Well, just in case you didn't know.

http://www.romhacking.net/docs/435/
« Last Edit: February 03, 2011, 10:24:07 pm by utunnels »

Vehek

« Reply #16 on: February 04, 2011, 06:43:34 pm »
Here's the table file. It still contains the English characters and other entries, so if you use it as is with that translation tool, the tool will fail to show the English script because it rejects duplicate table entries.
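
For anyone who wants to check a table for duplicates before feeding it to the tool, here is a minimal Python sketch. It assumes the "CODE=glyph" line format shown elsewhere in this thread; `find_duplicates` is a made-up helper name, and the second code for "s" in the sample is invented purely for illustration. The post doesn't say whether the tool objects to repeated codes or to repeated characters, so the sketch reports both.

```python
from collections import defaultdict

def find_duplicates(lines):
    """Return (repeated_codes, repeated_glyphs) for 'CODE=glyph' table lines."""
    by_code = defaultdict(list)
    by_glyph = defaultdict(list)
    for raw in lines:
        line = raw.rstrip("\r\n")  # tolerate both Unix and Windows line endings
        if "=" not in line:
            continue  # skip blank or malformed lines
        code, glyph = line.split("=", 1)
        code = code.strip().upper()
        by_code[code].append(glyph)
        by_glyph[glyph].append(code)
    repeated_codes = {c: g for c, g in by_code.items() if len(g) > 1}
    repeated_glyphs = {g: c for g, c in by_glyph.items() if len(c) > 1}
    return repeated_codes, repeated_glyphs

# "7E=s" is a hypothetical entry, added only to trigger the duplicate report.
sample = ["08=s", "4C=す", "C2A8=来", "7E=s"]
codes, glyphs = find_duplicates(sample)
print("codes listed more than once:", codes)
print("glyphs reachable from several codes:", glyphs)
```

To run it on a real table, pass an open file instead of the sample list, e.g. `find_duplicates(open("table.tbl", encoding="utf-8"))`, where the file name is whatever the table was saved as.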

alfadorredux

« Reply #17 on: February 04, 2011, 08:36:24 pm »
Well, I've gotten this far with ZeaLitY's original sample text--

[@ITEM 6 (JP) ]
{36}{15}{C2}{AA}{33}{53}{2E}{18}{30}{C3}{8D}{33}{40}{66}{1E}
{6E}{3F}{59}{36}{15}{C3}{95}{C7}{9A}{32}{C3}{9E}{C6}{80}{C2}{91}{2E}
{70}{C2}{99}{1E}{7D}{38}{48}{24}{69}{30}{C3}{83}{35}{3C}{5A}{30}{1B}\0

maps to

[@ITEM 6 (JP) ]
この??ンイにいる??ンスター
つまりこの????は??????に
ダ??ージをあたえる??ができる。\0

I just need to fix the handling of two-byte characters, and then we'll need to complete that table.

alfadorredux

« Reply #18 on: February 04, 2011, 08:55:51 pm »
Okay, not my biggest PEBCAK of all time, but it may be able to compete for runner-up. 2-byte glyphs now work, so we have:

[@ITEM 6 (JP) ]
このハンイにいるモンスター
つまりこの場合は2ヒキに
ダメージをあたえる?ができる。\0

Or something like that (what I can make out looks sensical enough, although I can only read the hiragana). The question mark is a glyph that's missing from the table, C383.

I'll try to mark up the image of the font to indicate the unID'd glyphs in the morning. And I'll attach my excuse for a program then, for what it's worth--because of a workaround I had to do, chances are that it'll only work on Linux right now.
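
For readers following along, the two-byte handling can be sketched in Python roughly like this. It is not the Perl program mentioned above, just one way such a decoder might work. It assumes, from the sample alone, that any byte of C2 or higher starts a two-byte code, and that codes missing from the table print as "?". The name `decode_line` is made up, and the four table entries were read off by lining up the byte dump with the decoded text.

```python
import re

TOKEN = re.compile(r"\{([0-9A-Fa-f]{2})\}")

def decode_line(line, table, lead_min=0xC2):
    """Replace {XX} byte tokens with glyphs; unknown codes become '?'.

    lead_min is an assumption: bytes at or above it are treated as the
    first half of a two-byte code.
    """
    out = []
    pos = 0
    pending = None  # lead byte still waiting for its second byte
    for m in TOKEN.finditer(line):
        out.append(line[pos:m.start()])  # pass non-token text through
        pos = m.end()
        byte = m.group(1).upper()
        if pending is not None:
            out.append(table.get(pending + byte, "?"))
            pending = None
        elif int(byte, 16) >= lead_min:
            pending = byte
        else:
            out.append(table.get(byte, "?"))
    if pending is not None:
        out.append("?")  # lead byte with nothing after it
    out.append(line[pos:])
    return "".join(out)

table = {"36": "こ", "15": "の", "C2AA": "ハ", "33": "ン"}
# C383 is deliberately left out of the table, so it comes out as "?".
print(decode_line(r"{36}{15}{C2}{AA}{33}{C3}{83}\0", table))
```

Looking up the joined four-digit key means the table can keep one- and two-byte entries in the same dictionary, which matches the table format used in this thread.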
« Last Edit: February 04, 2011, 09:06:49 pm by alfadorredux »

utunnels

« Reply #19 on: February 05, 2011, 01:41:12 am »
It seems this picture matches Vehek's table, somehow.

alfadorredux

« Reply #20 on: February 05, 2011, 03:02:32 pm »
Yes, that would be the one from RHDN. Here's the annotated version:



Green glyphs are the ones in Vehek's current table (he's gotten about a third of them). White glyphs are the ones we need to add to the table before we can sort out the Japanese-language script. Yellow glyphs are the ones I'm reserving to work on myself (mostly accented Roman characters, plus some symbols and a few kanji). The blue letters and numbers along the edges are indices--write the row first, then the column. For example, the index of the lower-case d in the upper right corner is 0F.
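
A note on turning an index into a table key: the entries posted in this thread (08=s, 4C=す, C2A8=来, and C383 for the glyph at row C, column 3) are consistent with a simple rule. Indices up to 7F are used as is, and larger ones appear as the UTF-8 encoding of the index number. That rule is an inference from a handful of entries, not something anyone here has confirmed, so treat the Python sketch below as a convenience to double-check against known glyphs. Both function names are made up.

```python
def index_to_code(row, col):
    """Turn a grid position (row first, then column) into a table key.

    Assumption: the key is the UTF-8 encoding of the glyph number.
    This matches the entries posted so far but is unconfirmed.
    """
    if not 0 <= col <= 0xF:
        raise ValueError("column must be in the range 0-F")
    glyph_number = row * 16 + col
    return chr(glyph_number).encode("utf-8").hex().upper()

def code_to_index(code):
    """Inverse of index_to_code: table key back to (row, col)."""
    glyph_number = ord(bytes.fromhex(code).decode("utf-8"))
    return divmod(glyph_number, 16)

print(index_to_code(0x0, 0xF))  # 0F, the lower-case d
print(index_to_code(0xA, 0x8))  # C2A8
print(code_to_index("C383"))    # (12, 3), i.e. row C, column 3
```

If a key produced this way disagrees with a glyph that is already in the table, trust the table and treat the rule as wrong for that range.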

What we need to do now is match up the indices with the glyph characters. You do not need to know any Japanese to help with this--a willingness to match leetle tiny pictures with each other is sufficient. http://www.rikai.com/library/kanjitables/kanji_codes.unicode.shtml has a set of kanji characters if you're not able to type them (you may need the Shift-JIS table linked at the top of that page for a few of the odder things in the font, like the circled numbers). It may be easier to see details if you magnify the image to 200% of its original size (or not). There are still a lot of easily-identifiable glyphs (symbols, half-width kana, and simple kanji) that are not in the table.

Table entries should look like:
08=s
4C=す
C2A8=来

I can't do this alone, people--Vehek's been working on it for two years and has only gotten ~500 of ~1500 glyphs. Please pitch in even if you can only match one or two. Post your entries here so that I can add them to the table.

(ZeaLitY? Your thoughts on offering a small bounty for individual glyphs?)



Anyway, as I see it, there are four steps that need to be accomplished in order to finalize the non-English versions of the CT DS script (feel free to contradict me, everyone):
1. ID the remaining characters in the font
2. Simultaneously, someone uses the utility Acacia Sgt found to extract the script
3. I take the output of 1 and 2 and stuff it through my silly little program to get the Japanese characters
4. ZeaLitY then takes the output of 3 and does the spreadsheet thing

In case anyone's genuinely curious, I've attached the program (rather a grandiose name for <50 lines of code, half of which is boilerplate) that I used to convert the sample text. Requires a Perl interpreter. Also requires Vehek's table file in the same directory. May require that line 8 ("chop($_);") be deleted in order to function under Windows. Usage: perl mapds.pl [file] (with the output being sent to [file].jptxt).

ZeaLitY

« Reply #21 on: February 05, 2011, 04:24:15 pm »
Something like 10 per glyph?

Hah, this is exciting. And the spreadsheet columns can always be copied and pasted into individual script text files if necessary.

Kodokami

« Reply #22 on: February 05, 2011, 06:10:46 pm »
Oh man, oh man. That is difficult to read. No chance of higher resolution, I'm guessing?
-----------
C3AA=山
C482=上
C48E=反
C4B6=臣
C6A6=工
CD84=穴
D082=矢
-----------
C883=失
C8A2=止
CA86=平


I'll add more as I go. (bold=new)
10 down, around 1000 more to go! 8)
« Last Edit: February 06, 2011, 01:06:02 am by Kodokami »

ZeaLitY

« Reply #23 on: February 05, 2011, 07:02:37 pm »
The Compendium forums naturally resize the image to be a little smaller. Try opening it directly in a separate window; makes it marginally easier. (You probably already tried this, though...)

alfadorredux

« Reply #24 on: February 05, 2011, 07:25:09 pm »
@Kodokami: Thanks for those. Unfortunately, I didn't create the original font image (and my guess is that the characters were never larger than the 8pt they seem to be at now). I can resize it, but the image would be of the same quality as if you just downloaded it, slapped it in Photoshop/the GIMP/whatever, and set the zoom to 200%, so I don't think it's worth it.

@ZeaLitY: Yeah, 5-10 is about what I was thinking. I take it you're okay with proceeding this way, then.

ZeaLitY

« Reply #25 on: February 05, 2011, 09:52:34 pm »
Yeah. We'll inch our way towards glory. This damn script will yield its secrets! And we'll have the Dream Devourer scene to retranslate, per Arc Impulse's discovery of subtler nuances left out of the Jeremy Harmer translation. (Well, nothing earth-shattering; just subtleties...)

utunnels

« Reply #26 on: February 05, 2011, 10:36:17 pm »
Should I type up those kanji? I think I'm at least not bad at reading them.
 :evil:

Kodokami

« Reply #27 on: February 06, 2011, 01:10:13 am »
By the way, if it helps anyone who's working on this, I've been using this website to find the right kanji. It's still a pain, but significantly easier when the search is narrowed down.

utunnels

« Reply #28 on: February 06, 2011, 01:27:23 am »
These are what I've done so far (still working on the rest).

{ } means an empty slot in the picture.
Red means the preceding character differs from Vehek's table, or that I'm not quite sure of it.
?? means I can't quite read the character.


{ }e{ }toanrsi{ }hl{ }ud
m.cygのwfい…p.b:ー「
なk!!たv,、て'かっIしにS
るTはンうがこんをMれらで?とま
スAもC-Dだ?あPWくすッBお
ルHトイ{ }ラLよOりきさGRアY
EそF{ }わけタN・え0どシドつリ
ダマク1ちみやゃじねガzジデ2
Kレ人プサムロカテせ王3ネミろィ
めキナオ魔様コバ大メば4達ウ1フ
エょボ時j0:力来者ハ前行xVグ
私ノ出ョ何ォャュq6撃U中戦ぞ手
城入5事(ベ『ヴ)チ』体パモ技ビ
間ワ気9ブ場ずツ秘長~村見物2セ
代屋8ゴPへ黒ほ世Q山部/火竜げ
一ゲぶ7宝生回ごケJ—殿ゆ所用ニ
べ今上ソヤ持下女言3思夢ポ法反石
最"地ぬギ強元死)復_ェ(ぜ子ズ
天びH全年勇むひザ方敵ペ水Z家
道当使底命光臣箱ぎ原防ヨ森ふ俺冥
色攻恐ピヘ修剣ホ本現4国空海理広
金变後通{Flame}士作未ざ次+待町+ぐ始
話不M号戻伝.動無老跡連5食知
{Dark}性分度開先聖Xぼ自虹6器層闘」
ヒ誰9化日調*心吸古窟外R域御取
里単ゅヌ北民封刑鳥{Heaven}落橋定開能発
墟発名守星倒工目{Droplet}逃帰果装備助成
少賢炎小画込巨G近早太{Heart}軍味説残
洞明以專{Star}電切界妃々印D高賴
終属務放仕議ゼヲ{note}路合眠宿探暗木
A口黄好直ゥ收立期遠刀負面解携掛
院団返兵教消勝悪除初置必完換砂感
奥頭身失突段乱%新交陽{small note}具闇料階
昔特呼貝願秘決仲会向記男殺Sづ品
父西止飲予宮壊設引友兄続信斬&判
楽同輝風姿要源腕素録我巣売別安
常育二漠E意進受千裁起武東破内V
半配旅他裏番7急計F{ }Y実扉母主
飛族利混弱白超赤対相機親形線足
確認転岩挑深丈O望究活了供鉄·X
TB追館夫礼平在可救永登異乗険考
万師氷危夜船L押付音表覚血酋効嘆
墓庫鐮C去運声ゾ態着左員塔鎧射差
毒休研術的和情罪三8閉N過渡然酒
右緑存影製証并走数買ぽ愛応更
加幼依得雷姉翼葉示役歴重係君??
ぴx正報岬邪??視卵殻弓怪投操
伏書悲種捕等查材絶角弾円娘憶字陸
踊処野耐監室排銃倍祖x歌I鐘建精
呪ぁ率限念車像弟鋼球侵美拳僕彼
多虫流敗珠薬両暴再械敷警幻辺滅輪
姫波速ぷ苦章削{ }南検緒移捜史病??
迷罠亡獄穴管演銀羽住紋郎値約震希
十型令許良島%頂客狩虜花張片戒鏡
質送→傷非泣困結ユ肉参極降爆寄却
隠草熱似牙冒干亭牢>廷èé銅[]
奇殴湿拔殊授樹想読位冷造忘格脱
折央K惑月怒触真断謝霊治構契功獣
{ }{upper arrow}商択注故災害寝堂縱観帝í鬼服
巻衣怨打玉談細??購軽孫編損P{right arrow}験
旋店歩ぇ暦障宙集爪盗将継側景改浮
遊導築喰舞剛丸粉烈語価告届避優
護至因ぺ適経誇U華試襲宇顔習都
禁求①②保床滝問題ú越鳴魅財布冠
英枚矢針雪ぉ{??}組測識退增脳若旦怖
Z徵与硬志系ヅ“”翻訳WLVHM
{ }{ }謎束散惠関燕壁棲焼糧八VS到
威容叉朱雀豆漆袋激柄雄衛盾槍刈吹
息瞬栄頑驚末酬準ぃ皆每密裂ぅ件量
業骨図アクセサリー堅澄刃抱第ぢ<
替Àà各市制際蹴{ }{ }髪遥短例雰
囲竸慣??句眼難還省崩兼灯拝③控
腹崎五討溪谷昇&嵐笑刻紅砲胴屑鉢
嶺隊魂煙脚踏伸棒誘咆哮静徐否催疾
閃召緊払奴歯貸恩狙珍悟労久泊疲植
補麗産析遭遇条義扱溜醒砕府描飾点
咲鋭根{ }寂座惜縛#◯ゎゐゑヂヮ
ヰヱヵヶ{ascii characters...}
{ascii characters...}
{ascii characters...}
噛河柱{ }{ }{ascii characters...}{(R)}{TM}
{(C)}~朝憧憬祈沈默荒愉快燃錯律廊祝
晴門基順接項況杯嬌派標頃拾類資賞
既式幅紫低響仮預宴序盤撒伐由??
典跳潰泡即睡臭衝痛恨鼻吐陏申済仏
騷充託暮畑支援勢鎖嫌才陣獲指揚慢
欲陥周承歳歓迎茂採苗鍛逸洗練凄興
逆借奪僞映聴詳縮霸摔彗貫奏粒畜積
歪鉤温般磨司清掃織匹??包妖霧憂純
??七浴靭養抽狂筋甘貯藏評級鉱群
囚淵ⅠⅡ環境鑑妙虚憐並孤涯背飼<
独JQabcdefghijklm
npqrtuwxyz一台尻尾奈
$;=@[\]^`{|}~
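
A grid transcribed in this notation can be split back into indexed entries with a short script. The Python sketch below is one possible approach, not a tool anyone in the thread used: it treats {name} as a single cell, ?? as a single unreadable cell, and everything else as one character per cell, then labels each cell with its row and column. The name `parse_row` is made up. Two known limitations: rows that contain literal braces (like the last one) or two real question marks in a row will be misread, and the index it prints is the plain row-and-column index, which may still need converting to the key format used in the table.

```python
import re

# One cell is a {…} group, a ?? marker, or any single character.
CELL = re.compile(r"\{[^{}]*\}|\?\?|.")

def parse_row(row_number, text):
    """Split one transcribed row into (index, glyph) pairs.

    Empty slots ({ }) and unreadable cells (??) produce no entry
    but still occupy a column.
    """
    cells = CELL.findall(text)
    if len(cells) > 16:
        raise ValueError(f"row {row_number:X} has {len(cells)} cells, expected at most 16")
    entries = []
    for col, cell in enumerate(cells):
        if cell == "??":
            continue
        if cell.startswith("{") and cell.endswith("}") and not cell[1:-1].strip():
            continue  # { } is an empty slot
        entries.append((f"{row_number:X}{col:X}", cell))
    return entries

row0 = "{ }e{ }toanrsi{ }hl{ }ud"
for index, glyph in parse_row(0x0, row0):
    print(f"{index}={glyph}")
```

The length check is there because a miscounted row shifts every index after the mistake, and that is easier to catch per row than after the whole table has been merged.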
« Last Edit: February 06, 2011, 11:06:25 am by utunnels »

Vehek

« Reply #29 on: February 06, 2011, 01:37:41 am »
While my table was initially based off comparing my dumps to the SNES script and filling it in from that, I never got around to inputting all the kanji from the SNES version.

For instance, that C383 in alfador's test of his program was 事 in the SNES version.
And CC8D is 鐘.
« Last Edit: February 06, 2011, 01:42:59 am by Vehek »