-
Notifications
You must be signed in to change notification settings - Fork 197
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make wget automatically determine the file name. #536
base: master
Are you sure you want to change the base?
Conversation
wget will try to determine the file name by: 1. Stripping all trailing slashes 2. Take everything right of the last remaining slash, meaning either the full domain name or the file or foldername on the server.
Thank you! This is something I've been meaning to do for a while. Instead of using a loop, it might be a little cleaner to use -- Trims everything after ? or # (means foo#bar or foo?bar=qux reduce to foo),
-- then trims trailing `/`
sUrl = sUrl:gsub( "[#?].*", "" ):gsub( "/+$", "" )
-- Find everything from the last / to the end of the string
return sUrl:match( "/([^/]+)$" ) or sUrl Note I've only done some limited testing, so this may be more susceptible to breaking. |
foo#bar should be reduced to bar, however I'm not sure if foo?bar=s should be reduced to foo. As for the pattern I tested them against a bunch of edge cases and they seem to work. |
That's a good point. I don't have access to CC in MC right now, but both CCEmuX and CCEmuRedux also emit |
On NTFS (the most commonly used Windows filesystem) it will fail. |
Ok it will now strip anchors and url parameters and I also applied the suggestion of SquidDev to use patterns and gsub/find |
…alled before checking if the URL is valid
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any opinions on this?
|
||
local function getFilename( sUrl ) | ||
sUrl = sUrl:gsub( "[#?].*" , "" ):gsub( "/+$" , "" ) | ||
return sUrl:match( "/([^/]+)$" ) or " " |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure if the or " "
should be here or further down(See comment on line 52)
@@ -43,7 +48,7 @@ end | |||
|
|||
-- Determine file to download | |||
local sUrl = tArgs[1] | |||
local sFile = tArgs[2] | |||
local sFile = tArgs[2] or getFilename(sUrl) | |||
local sPath = shell.resolve( sFile ) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be replaced by shell.resolve( sFile or " ")
On my machine (Windows), I think it would be better to move the |
Ok in that case I will put the |
I just noticed that the help page of wget should be changed to reflect this change. Going to do that this evening. |
--Check if the URL is valid | ||
local ok, err = http.checkURL( sUrl ) | ||
if not ok then | ||
if err then |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if it'd be worth doing printError( err or "Failed. ")
instead. Though err
shouldn't ever be nil, so neither is really needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is done it would probably be better to use "Invalid URL" or something like that instead of "Failed." to distinguish it from the "Failed." message caused when the download failed.
Edit: But you are right in any way the if is not needed. Just copied the checking function from the original
Make wget automatically determine the file name.
Sometimes when using wget it is pretty annoying having to write the file name again.
This PR makes that unnecessary by automatically determining the file name.
This is done by stripping all trailing slashes and afterwards taking everything after the last remaining slash.
This is either the domain name if used like this: https://domain.tld or the file name if used liked this: https://domain.tld/folder/file.ext