This topic contains 9 replies, has 4 voices, and was last updated by
nerobot 1 year, 4 months ago.
November 21, 2017 at 2:50 pm #11886
How to load a file with non-Latin chars in its path?
It's a very old problem that I've faced since BlitzMax.
In Monkey2, libc.stat( path,Varptr st ) returns -1, which means an error.
But why does it happen? And how can I fix it?
I'm on Windows 10 and the path contains Cyrillic chars.
November 21, 2017 at 7:59 pm #11890
Basically, the libc module (and more…) needs to be rewritten for Windows to use the ‘wide char’ versions of various OS filesystem APIs.
This is not a problem on any other OSes because they all very sensibly use utf-8 c strings for all APIs, which is backwardly compatible with old school ascii and means I can use ‘standard c’ APIs everywhere – except Windows!
It’ll happen eventually…
November 24, 2017 at 8:28 am #11941
Searching for a solution, I found the _wfopen() function (the wide-char sibling of fopen() ), but it requires a wchar_t * , and I can’t convert a Monkey string into a wchar_t Ptr .
I gave up..
November 25, 2017 at 1:40 pm #11970
I think I’m onto something. x’D
I’ve at least converted the strings to wchar_t.
    Namespace myapp

    #Import "<std>"
    #Import "libcore"
    #Import "textfile.txt"

    Using std..
    Using libc..

    Function Main()
        Local path:String = "textfile.txt"
        Local mode:String = "w"

        Local pathBuf:DataBuffer = New DataBuffer(path.Length*2)
        Local modeBuf:DataBuffer = New DataBuffer(1)

        path.ToWString( pathBuf.Data, path.Length*2)
        mode.ToWString( modeBuf.Data, mode.Length*2)

        Local path_w:wchar_t Ptr = Cast<wchar_t Ptr>(pathBuf.Data)
        Local mode_w:wchar_t Ptr = Cast<wchar_t Ptr>(modeBuf.Data)

        For Local pi := 0 Until pathBuf.Length
            Print pathBuf.PeekString(pi)
        Next
        Print modeBuf.PeekString(0)

        Local file := _wfopen(path_w, mode_w)
        If(file = Null) Then Print "Failed"

        Print "Finished."
    End

libcore.monkey2:

    Namespace libc

    Extern

    Function _wfopen:FILE Ptr( path:wchar_t Ptr,mode:wchar_t Ptr )

Not sure if this works as it’s coded. It’s kinda hacky and probably doesn’t work. Lol. Almost there… Darn.
November 26, 2017 at 9:48 am #11986
Yes, it doesn’t work.
Waiting for Mark’s implementation of utf-8.
November 26, 2017 at 12:52 pm #11987
I’m not sure it would need to be rewritten entirely. I read this web site a while back, and the recommendation seems to be “just convert to/from WCHAR” only at the point where the Win32 API requires it:
utf8everywhere.org (Should jump to “Our Conclusions”.)
Portability, cross-platform interoperability and simplicity are more important than interoperability with existing platform APIs. So, the best approach is to use UTF-8 narrow strings everywhere and convert them back and forth when using platform APIs that don’t support UTF-8 and accept wide strings (e.g. Windows API). Performance is seldom an issue of any relevance when dealing with string-accepting system APIs (e.g. UI code and file system APIs), and there is a great advantage to using the same encoding everywhere else in the application, so we see no sufficient reason to do otherwise.
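A rough C sketch of the "convert only at the API boundary" idea quoted above: the program keeps UTF-8 strings everywhere, and only this one wrapper knows about wide chars. The name fopen_utf8 is made up for illustration, and the fixed buffer sizes are a simplifying assumption.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical wrapper: fopen() for UTF-8 encoded paths on every platform. */
FILE *fopen_utf8(const char *path, const char *mode) {
    assert(path != NULL && mode != NULL);
#ifdef _WIN32
    /* The only place where UTF-16 appears: convert, call, forget. */
    wchar_t wpath[MAX_PATH];
    wchar_t wmode[16];
    if (!MultiByteToWideChar(CP_UTF8, 0, path, -1, wpath, MAX_PATH)) return NULL;
    if (!MultiByteToWideChar(CP_UTF8, 0, mode, -1, wmode, 16)) return NULL;
    return _wfopen(wpath, wmode);
#else
    /* UTF-8 paths pass straight through to the standard C API. */
    return fopen(path, mode);
#endif
}
```

Callers never see wchar_t, so the rest of the code stays the same on all platforms.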
November 26, 2017 at 6:47 pm #11992
“just convert to/from WCHAR” only at the point where the Win32 API requires it
This is the basic idea but it’s not trivial!
I should have it mostly done now though – just pushed to develop branch so feel free to check it out.
There is also a minor chance that I’ve borked something in the process, as this stuff is used in lots of places. Ted2go is a pretty good test of libc/filesystem stuff and that seems to be going OK but still…use with caution.
Also found this in the process: utf8everywhere.org
A pretty good read on all the issues, and I agree with about 97% of it I think…
November 26, 2017 at 7:11 pm #11994
That’s the web site I linked!
(Yeah, no doubt much more complex to implement than it appears anyway.)
November 26, 2017 at 7:59 pm #11995
That’s the web site I linked!
That’ll teach me to skim read before coffee at 7.00 am…
And the wchar stuff isn’t all that complex to implement, you just need to be really careful of 2 huge C bogey-men – buffer over-runs and memory leaks!
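To illustrate the two bogey-men, here is a hedged C sketch of a UTF-8 to UTF-16 conversion that first counts the space it needs (including the terminator), then allocates, then fills. It is not the code that went into Monkey2; the function names are made up, and the hand-written decoder stands in for MultiByteToWideChar so the sketch stays portable. It uses uint16_t because wchar_t is 16 bits on Windows but not everywhere else.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Decode one UTF-8 sequence at s into *cp. Returns the number of bytes
   consumed, or 0 if the sequence is malformed. */
static size_t decode_utf8(const unsigned char *s, uint32_t *cp) {
    size_t n, i;
    uint32_t min;
    if (s[0] < 0x80) { *cp = s[0]; return 1; }
    if ((s[0] & 0xE0) == 0xC0)      { *cp = s[0] & 0x1F; n = 2; min = 0x80; }
    else if ((s[0] & 0xF0) == 0xE0) { *cp = s[0] & 0x0F; n = 3; min = 0x800; }
    else if ((s[0] & 0xF8) == 0xF0) { *cp = s[0] & 0x07; n = 4; min = 0x10000; }
    else return 0;
    for (i = 1; i < n; i++) {
        /* A NUL terminator fails this check, so we never read past the end. */
        if ((s[i] & 0xC0) != 0x80) return 0;
        *cp = (*cp << 6) | (s[i] & 0x3F);
    }
    /* Reject overlong forms, surrogates and values above U+10FFFF. */
    if (*cp < min || *cp > 0x10FFFF || (*cp >= 0xD800 && *cp <= 0xDFFF)) return 0;
    return n;
}

/* Convert a NUL-terminated UTF-8 string to a newly allocated, NUL-terminated
   UTF-16 string. Returns NULL on malformed input or allocation failure.
   The caller owns the result and must free() it (avoids the leak). */
uint16_t *utf8_to_utf16(const char *utf8) {
    const unsigned char *s = (const unsigned char *)utf8;
    const unsigned char *p;
    size_t units = 0, n, i = 0;
    uint32_t cp;
    uint16_t *out;
    assert(utf8 != NULL);

    /* Pass 1: count the UTF-16 code units needed (avoids the over-run). */
    for (p = s; *p; p += n) {
        n = decode_utf8(p, &cp);
        if (n == 0) return NULL;
        units += (cp >= 0x10000) ? 2 : 1;
    }

    /* Pass 2: allocate room for the terminator as well, then fill. */
    out = malloc((units + 1) * sizeof *out);
    if (out == NULL) return NULL;
    for (p = s; *p; p += n) {
        n = decode_utf8(p, &cp);
        if (cp >= 0x10000) {            /* needs a surrogate pair */
            cp -= 0x10000;
            out[i++] = (uint16_t)(0xD800 | (cp >> 10));
            out[i++] = (uint16_t)(0xDC00 | (cp & 0x3FF));
        } else {
            out[i++] = (uint16_t)cp;
        }
    }
    out[i] = 0;
    return out;
}
```

The count-then-allocate shape is the same one MultiByteToWideChar supports when it is called with a zero-length output buffer to query the required size.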
November 27, 2017 at 3:00 am #12001
I should have it mostly done now though – just pushed to develop branch so feel free to check it out.
Very nice news! It works for me.
And my modest $15 sent to this gentleman!