Subject:
From: Ethan Dalool
Date: Tue, 3 Mar 2020 17:55:46 -0800
Hmm, thanks for the tip, but megatools-1.11.0-git-20181018-winxp and megatools-1.11.0-git-20191107-winxp (the oldest and newest versions from /experimental) aren't working either.

I'll come up with an alternate solution for handling my Unicode files for the time being, and I'll keep an eye on megatools updates too.
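
For reference, here is a minimal sketch of the kind of workaround I have in mind, not a finished solution. It assumes the real limitation is that only names representable in the system codepage survive the trip into megaput. The helper names (fits_codepage, seen_through_codepage, stage_for_upload), the "cp1252" codepage, and the sample filename are all hypothetical and are not part of megatools.

```python
import os
import shutil
import tempfile
import uuid

# Assumption: "cp1252" stands in for the system (ANSI) codepage.
# The active codepage on a real Windows machine may differ.
ASSUMED_CODEPAGE = "cp1252"


def fits_codepage(name, codepage=ASSUMED_CODEPAGE):
    """Return True if every character in name can be encoded in codepage."""
    try:
        name.encode(codepage)
    except UnicodeEncodeError:
        return False
    return True


def seen_through_codepage(name, codepage=ASSUMED_CODEPAGE):
    """Show what a lossy conversion to codepage does to name.

    Characters outside the codepage come back as '?', which resembles the
    '?????.zip' shown in the megaput error message.
    """
    return name.encode(codepage, errors="replace").decode(codepage)


def stage_for_upload(path, codepage=ASSUMED_CODEPAGE):
    """Return a path that a codepage-limited program should be able to open.

    If path already fits the codepage it is returned unchanged. Otherwise the
    file is copied into a fresh temporary directory under a random ASCII name
    (keeping the extension when it is ASCII). The caller is responsible for
    deleting the staging directory afterwards.
    """
    if fits_codepage(path, codepage):
        return path
    ext = os.path.splitext(path)[1]
    if not fits_codepage(ext, "ascii"):
        ext = ""
    staging_dir = tempfile.mkdtemp(prefix="upload_")
    staged = os.path.join(staging_dir, uuid.uuid4().hex + ext)
    shutil.copyfile(path, staged)
    return staged
```

The staged path would then take the place of the filename in the subprocess command quoted further down. The obvious cost is that the file arrives on the remote side under the staged name, so the original name has to be tracked separately.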

Thanks

-----Original Message-----
From: Ondřej Jirman <megatools@megous.com> 
Sent: Tuesday, March 3, 2020 5:45 PM
To: Ethan Dalool <ethan@voussoir.net>
Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file

On Tue, Mar 03, 2020 at 05:30:43PM -0800, Ethan Dalool wrote:
> My understanding of Windows' codepages / encodings is that the names
> on the filesystem are stored as UTF-16. I understand what you mean
> about CHARSET only being an indicator and not actually controlling the
> codepage, but that's what the `chcp` command is supposed to do, which
> I used in my screenshots. There's no dedicated chcp for UTF-16 though,
> only 65001 for UTF-8. I've tried all the chcp codes I can find online
> and nothing's working yet.
>
> One last thing I noticed is that the problem also happens if I do
> megaput *.txt, so this issue is NOT just the encoding of the
> command-line arguments, but also the encoding of the filenames AFTER
> the globbing system has resolved *.txt into my filenames. So I'm not
> sure what that contributes to this problem.
>
> Well, at this point it looks like the issue probably can't be fixed
> without replacing a great amount of glib calls in the megatools source
> code, and I don't have enough experience in C to make that change
> myself, so I guess I'll give up for now. Thanks for your continued
> hard work and your patience with this email thread.

One last thing you may try is to use the Windows XP version of megatools, since that's compiled with an older glib version. Maybe a newer glib changed something that broke this, since I remember this working in the past.

regards,
	o.

>
> Ethan
>
> -----Original Message-----
> From: Ondřej Jirman <megatools@megous.com>
> Sent: Tuesday, March 3, 2020 4:51 PM
> To: Ethan Dalool <ethan@voussoir.net>
> Subject: Re: 1.10.2 on Windows, megaput files with unicode names = error opening file
>
> On Tue, Mar 03, 2020 at 04:19:41PM -0800, Ethan Dalool wrote:
> > Hi Ondřej,
> >
> > I really appreciate your time and the software that you've created,
> > but I feel that you are responding to my emails without fully
> > recognizing the problem that I am showing. I am a competent user, a
> > programmer, I deal with Unicode on the command line daily, and I'm
> > aware of the limitations and common problems dealing with Unicode on
> > the command line. But this megaput problem is unique to me. The error
> > message being shown on the screen is one I recognize from when software
> > tries to open a filename containing illegal characters.
> >
> > I am attaching a series of screenshots showing every possible
> > incantation of megaput in cmd, powershell, and python subprocess;
> > with charset=UTF-8, CP65001, and 65001; assigned in-shell via
> > variable name, in-shell via the chcp command, and via the system
> > environment variable editor. All of them have the same issue.
> >
> > I only found ONE unique result, which is to set CHARSET=UTF-16. This
> > creates a different error message:
> >
> > (megaput.exe:21244): GLib-CRITICAL **: 16:14:31.850: ÿ_g
> >
> > but it still does not upload.
> >
> > Perhaps there is a correct incantation here somewhere, but this glib
> > library is behaving differently than any other Unicode-enabled piece
> > of command line software that I use.
>
> Yes, glib's Windows support doesn't use the Windows Unicode functions, but the old non-Unicode functions that work with the system's codepage, and glib converts to UTF-8 and back to the system's codepage when calling the *A functions.
>
> This is true for both command line params and console output.
>
> That probably means you can't just set the CHARSET envvar to whatever you like; it also has to match your system's codepage (the encoding used by winapi's *A functions). You just have to use CHARSET to inform megatools of the system's current codepage.
>
> There are some alternatives, like using g_win32_get_command_line:
>
>    https://developer.gnome.org/glib/stable/glib-Windows-Compatibility-Functions.html
>
> which megatools doesn't use, because it uses g_option_context_parse everywhere.
>
> regards,
>                 o.
>
> > Thanks,
> >
> > Ethan
> >
> > -----Original Message-----
> > From: Ondřej Jirman <megatools@megous.com>
> > Sent: Tuesday, March 3, 2020 3:30 PM
> > To: Ethan Dalool <ethan@voussoir.net>
> > Subject: Re: 1.10.2 on Windows, megaput files with unicode names =
> > error opening file
> >
> > On Tue, Mar 03, 2020 at 09:36:26AM -0800, Ethan Dalool wrote:
> > > Hi,
> > >
> > > The fact of the matter is, I don't actually use Powershell normally. I was only using it to prove that the charset of my terminal wasn't the cause of the problem.
> > >
> > > Actually, I discovered this bug because I'm calling megaput from
> > > Python's `subprocess` module. Specifically, my Python code is
> > >
> > > command = [megaput, '--config', f'{config_file}', f'{filename}']
> > > subprocess.check_output(command, stderr=subprocess.STDOUT, timeout=180)
> >
> > Hello,
> >
> > this means that you're passing filenames in UTF-8 encoding to megatools.
> >
> > Therefore you need to configure the CHARSET environment variable accordingly.
> > Either try the MS value of CP65001 or UTF-8.
> >
> > The https://megous.com/git/megatools/tree/README#n80 description is inaccurate in so far as CHARSET envvar is also used for converting command line arguments to UTF-8 from the encoding specified in CHARSET.
> >
> > regards,
> >                 o.
> >
> > > I didn't mention this earlier because I wanted to get straight to the point, and not distract from the conversation with Python.
> > >
> > > To be clear, I use subprocess with other programs on a daily basis, and they handle Unicode filenames ok. Even when I use windows cmd, which has an even more restrictive charset than powershell, it doesn't matter because subprocess passes the Unicode to the program properly. You said that glib is doing something -> utf-8 conversions, but from my experience with Python and subprocess, it should be receiving utf-8 from my calling process just fine. I have attached a screenshot demonstrating that I can use Unicode in my shell even when the shell can't display it.
> > >
> > > Also, your link says this (emphasis mine):
> > >
> > > > On Unix, the character sets are determined by consulting the environment variables G_FILENAME_ENCODING and G_BROKEN_FILENAMES.
> > > > On Windows, the character set used in the GLib API is always UTF-8 and said environment variables have no effect.
> > >
> > > I am a fellow programmer so I understand you're trying to close this
> > > ticket quickly. But I am quite sure I am doing everything properly for my system.
> > > Megaput is not accepting my Unicode input which is why I'm filing a
> > > bug report.
> > >
> > > Thanks,
> > >
> > > Ethan
> > >
> > > -----Original Message-----
> > > From: Ondřej Jirman <megatools@megous.com>
> > > Sent: Tuesday, March 3, 2020 9:07 AM
> > > To: Ethan Dalool <ethan@voussoir.net>
> > > Subject: Re: 1.10.2 on Windows, megaput files with unicode names =
> > > error opening file
> > >
> > > On Tue, Mar 03, 2020 at 08:22:11AM -0800, Ethan Dalool wrote:
> > > > Hi, Ondřej. Thanks for your very fast response.
> > > >
> > > > Your link says:
> > > >
> > > > > This is just a cosmetic issue. Internally, megatools always work
> > > > > with UTF-8 file names, and even if the tool's terminal output is
> > > > > corrupted, files names of downloaded/uploaded files will be correct.
> > > >
> > > > But the reason I sent this email is specifically because the files
> > > > are failing to upload. It's not just a cosmetic issue. Please see my error message again.
> > > >
> > > > I don't program in C++, but I know from Python experience that
> > > > attempting to `open()` a filename that contains invalid characters
> > > > yields the OS exception "Invalid argument". So when I see megatools
> > > > displaying question-mark filenames, even when I'm using Powershell,
> > > > which is capable of displaying UTF-8, together with the "Cannot open file: Invalid Argument" exception, it makes me suspicious.
> > >
> > > Hello,
> > >
> > > none of the powershell stuff matters. The input/output from megatools is handled by glib library, which does utf8->something conversion when printing to the stdout and something->utf8 conversion when taking command line filename type arguments.
> > >
> > > Glib uses some environment variables to decide what that something will be.
> > > You need to have these environment variables set even under powershell, otherwise glib will cause a mess.
> > >
> > > You can read about it here:
> > >
> > >    https://developer.gnome.org/glib/stable/glib-Character-Set-Conversion.html
> > >
> > > regards,
> > >                 o.
> > >
> > > > -----Original Message-----
> > > > From: Ondřej Jirman <megatools@megous.com>
> > > > Sent: Tuesday, March 3, 2020 3:12 AM
> > > > To: Ethan Dalool <ethan@voussoir.net>
> > > > Subject: Re: 1.10.2 on Windows, megaput files with unicode names =
> > > > error opening file
> > > >
> > > > Hello,
> > > >
> > > > On Tue, Mar 03, 2020 at 12:42:48AM -0800, Ethan Dalool wrote:
> > > > > Hi,
> > > > >
> > > > > First of all, thank you for megatools. I think it's the best
> > > > > software of its kind for interacting with mega.
> > > >
> > > > I'm glad megatools works for you.
> > > >
> > > > > I'm having some trouble using megaput to upload files with Unicode
> > > > > characters in the filename. It gives me this error:
> > > > >
> > > > > > D:\software\megatools\1.10.2\megaput.exe --config mega.ini "C:\outbox\ ȳ  ϼ   .zip"
> > > > >
> > > > > ERROR: Upload failed for 'C:\outbox\?????.zip': Can't read local file C:\outbox\?????.zip: Error opening file C:\outbox\?????.zip: Invalid argument
> > > >
> > > > https://megous.com/git/megatools/tree/README#n80
> > > >
> > > > does this help?
> > > >
> > > > regards,
> > > >             o.
> > > >
> > > > > I notice that the Unicode characters are being replaced by ? even
> > > > > though Powershell is capable of displaying them, which leads me to
> > > > > believe that somewhere internally, megaput is escaping the filename,
> > > > > converting out-of-page characters to ? prior to upload, and then
> > > > > choking on it. I know that many programs do this kind of escaping
> > > > > for display purposes but clearly this escaped name shouldn't be going to the upload routine.
> > > > >
> > > > > I hope this issue will be simple to resolve, and if I can provide
> > > > > anything else to make it easier please let me know.
> > > > >
> > > > > Thanks