Provide Sink/Source implementations writing to stdout/reading from stdin #375

Open
fzhinkin opened this issue Aug 19, 2024 · 15 comments

@fzhinkin
Collaborator

Some applications may require reading from stdin or writing to stdout, but there are no built-in factories or functions returning the corresponding Sink/Source implementations.

It should be trivial to provide them for all targets except JS running in the browser.
For the latter, we can return an always-exhausted Source and a Sink writing to console.log.

@lppedd
Contributor

lppedd commented Aug 19, 2024

A very welcome enhancement. I was considering implementing this kind of project, and having stdin and stdout for Native implemented here would definitely make my life easier.

stderr would also be useful.

@JakeWharton
Contributor

JakeWharton commented Aug 19, 2024

On native POSIX, being able to wrap a raw sink / raw source around any file descriptor would be convenient. The biggest downside is exposing the type-unsafe FD parameter as just an Int or whatever. But presumably it would be the implementation you'd use for stdin and stdout on native anyway, so having it public saves people from getting it wrong when they want to read/write some other FD.
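One possible way to soften the type-safety concern is to wrap the descriptor number in a value class instead of exposing a bare Int. The sketch below is only an illustration: `FileDescriptor` and `describe` are hypothetical names and are not part of kotlinx-io.

```kotlin
import kotlin.jvm.JvmInline

// Hypothetical wrapper type (not a kotlinx-io API): a descriptor number that
// cannot be confused with an arbitrary Int at call sites.
@JvmInline
value class FileDescriptor(val value: Int) {
    init {
        require(value >= 0) { "File descriptor must be non-negative, was $value" }
    }

    companion object {
        // Conventional POSIX descriptor numbers.
        val STDIN = FileDescriptor(0)
        val STDOUT = FileDescriptor(1)
        val STDERR = FileDescriptor(2)
    }
}

// Hypothetical consumer: a real factory would return a RawSink or RawSource instead.
fun describe(fd: FileDescriptor): String = when (fd) {
    FileDescriptor.STDIN -> "stdin"
    FileDescriptor.STDOUT -> "stdout"
    FileDescriptor.STDERR -> "stderr"
    else -> "fd ${fd.value}"
}
```

A value class has no allocation cost in most call paths, so a public factory could take `FileDescriptor` while the implementation still passes the plain number to the platform call.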

@fzhinkin
Collaborator Author

we can return an always-exhausted Source and a Sink writing to console.log.

console.log/error won't work here as is: RawSink deals with raw bytes, not strings. We could probably try to reinterpret these bytes as a string ourselves and then emit the result into console.log, but I'm not sure it's worth the effort.
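To make the "reinterpret these bytes as a string" idea concrete, here is a minimal sketch of a RawSink that collects bytes, decodes them as UTF-8, and hands complete lines to a callback. `LineEmittingSink` and `emitLine` are hypothetical names; in a browser the callback could be console.log, but that wiring is an assumption and is not shown.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.RawSink
import kotlinx.io.indexOf
import kotlinx.io.readString

// Hypothetical sketch (not a kotlinx-io API): decodes UTF-8 and emits whole lines.
class LineEmittingSink(private val emitLine: (String) -> Unit) : RawSink {
    // Bytes received so far that do not yet form a complete line.
    private val pending = Buffer()

    override fun write(source: Buffer, byteCount: Long) {
        pending.write(source, byteCount)
        // Splitting at '\n' is safe: byte 0x0A never occurs inside a
        // multi-byte UTF-8 sequence, so a line is always decodable on its own.
        while (true) {
            val newline = pending.indexOf('\n'.code.toByte())
            if (newline == -1L) break
            emitLine(pending.readString(newline))
            pending.skip(1) // drop the '\n' itself
        }
    }

    override fun flush() {
        // Intentionally empty: a console.log-style callback always ends the line,
        // so a partial line is held back until it is completed or the sink is closed.
    }

    override fun close() {
        if (pending.size > 0L) emitLine(pending.readString())
    }
}
```

The cost of this approach is that output without a trailing newline stays invisible until close(), which is one reason it may not be worth the effort.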

@fzhinkin
Collaborator Author

Another question to consider: should these stdin/out/err Sources/Sinks be buffered or raw?
Returning buffered sinks and sources could reduce the chance of accidentally closing application-wide streams/files. With raw ones, users would likely write:

stdoutSink().buffered().use {
  it.writeString("Goodbye, /proc/self/fd/1")
}

Instead, it would be:

val out: Sink = stdoutSink()
out.writeString("Hey!")
out.flush()

@lppedd
Contributor

lppedd commented Aug 19, 2024

console.log/error won't work

For the browser I'd just avoid implementing it for the first iteration, let's see how usable the outcome is first.
For Node.js you can use process.stdin/stdout/stderr with fs.readSync, I suppose, although for these kinds of streams it would be cool to have a suspending variant at some point.

@fzhinkin
Collaborator Author

Yep, for Node.js everything works with the existing FileSink and FileSource implementations (assuming their constructors accept an FD instead of a file path).

@lppedd
Contributor

lppedd commented Aug 19, 2024

It looks like you'll just have to create secondary constructors.

I see that the Node.js version uses readFileSync. You could switch to readSync and keep your own Buffer instance around, which would also avoid having to assert (!!) it.

@qwwdfsad
Member

Another question to consider: should these stdin/out/err Sources/Sinks be buffered or raw?

When re-doing readln, we explicitly ruled out buffering to avoid interference with other platform sources (e.g. System.in or System.console()) -- the idea was to be able to have independent readLine calls between various sources.
The argument mostly holds if we are going to provide top-level access like Source.stdin

@qwwdfsad
Member

On a side note, we might finally be able to address the annoying Windows console encoding issue.

@fzhinkin
Collaborator Author

The argument mostly holds if we are going to provide top-level access like Source.stdin

Then we have to use some alternative buffering implementation, as the current one prefetches data eagerly.
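The eager prefetch can be observed on the JVM with a small experiment. This is a sketch, not a proposed API: `bytesLeftAfterBufferedReadLine` is a hypothetical helper, and a ByteArrayInputStream stands in for System.in.

```kotlin
import java.io.ByteArrayInputStream
import kotlinx.io.asSource
import kotlinx.io.buffered
import kotlinx.io.readLine

// Hypothetical helper: reads one line through a buffered Source and reports
// how many bytes are still available in the underlying stream afterwards.
fun bytesLeftAfterBufferedReadLine(input: ByteArray): Int {
    val stream = ByteArrayInputStream(input) // stand-in for System.in
    val source = stream.asSource().buffered()
    source.readLine() // logically consumes only the first line
    // The buffered source asked the stream for as much as fits into its buffer,
    // so for small inputs nothing is left for other readers of the same stream.
    return stream.available()
}
```

For a short input such as "first\nsecond\n" the helper returns 0: the second line already sits in the Source's buffer, so a later readln on the same stream would not see it.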

@fzhinkin
Collaborator Author

fzhinkin commented Aug 21, 2024

Source.stdin could be a Source, not a RawSource. And we can either use a new implementation or parameterize RealSource to buffer only the requested number of bytes from the underlying input stream / fd. That'll prevent unwanted interference with other platform sources.

However, that'll work only until the first .buffered() call on Source.stdin, as the newly created buffered source will buffer eagerly. And since we encourage users to use RawSource where possible, Source.stdin will likely be wrapped into yet another buffered source somewhere, where its origin will be unknown:

fun parseJson(source: RawSource) {
    val buffered = source.buffered()
    ...
}

// jq.kt
fun main(args: Array<String>) {
    val inputFileName = getFileName(args)
    if (inputFileName == "-") {
        parseJson(Source.stdin)
    } else {
        SystemFileSystem.source(Path(inputFileName)).use {
            parseJson(it)
        }
    }
}

Yet another issue is that Source.exhausted needs to read at least a single byte from the underlying source, so it will always cause interference.
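The effect of exhausted() can be made visible by counting what a buffered source pulls from the source underneath it. The sketch below assumes nothing beyond the public RawSource interface; `CountingSource` and `bytesPulledByExhaustedCheck` are hypothetical names used only for this illustration.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.RawSource
import kotlinx.io.buffered

// Hypothetical wrapper: counts the bytes handed out by the delegate.
class CountingSource(private val delegate: RawSource) : RawSource {
    var bytesPulled = 0L
        private set

    override fun readAtMostTo(sink: Buffer, byteCount: Long): Long {
        val read = delegate.readAtMostTo(sink, byteCount)
        if (read > 0) bytesPulled += read // -1 signals exhaustion, so skip it
        return read
    }

    override fun close() = delegate.close()
}

// Hypothetical helper: calls exhausted() once on a buffered view and reports
// how many bytes that single check took from the underlying source.
fun bytesPulledByExhaustedCheck(input: ByteArray): Long {
    val counting = CountingSource(Buffer().also { it.write(input) })
    counting.buffered().exhausted()
    return counting.bytesPulled
}
```

For any non-empty input the count is at least 1, although the caller never asked to read anything.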

@fzhinkin
Collaborator Author

After some discussion, it seems that a stdin-backed Source interfering with readln is not a problem (or, to be more precise, it's not a problem we can solve for Source):

  • an unbuffered stdin-backed RawSource won't have this problem, as it'll read only the requested number of bytes;
  • to get the buffered version, one has to explicitly write something like StandardInputSource.buffered();
  • and, well, buffered I/O requires buffering, so there's not much we can do. ;)
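The first point can be checked on the JVM with a raw source created from an InputStream. This is a sketch under the assumption that a stdin-backed RawSource would behave like the existing InputStream.asSource() adapter; `bytesLeftAfterRawRead` is a hypothetical helper and a ByteArrayInputStream stands in for System.in.

```kotlin
import java.io.ByteArrayInputStream
import kotlinx.io.Buffer
import kotlinx.io.asSource

// Hypothetical helper: performs one unbuffered read of at most `requested` bytes
// and reports how many bytes remain in the underlying stream.
fun bytesLeftAfterRawRead(input: ByteArray, requested: Long): Int {
    val stream = ByteArrayInputStream(input) // stand-in for System.in
    val raw = stream.asSource()              // RawSource, no prefetching
    raw.readAtMostTo(Buffer(), requested)    // never takes more than requested
    return stream.available()
}
```

With a six-byte input and a two-byte request, four bytes stay in the stream for whoever reads next.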

It's worth mentioning that such a source won't replace something similar to Java's Console class, and if interactive I/O through a virtual terminal is required, some other interface should be introduced.

The same considerations are also applicable to stdout/err-backed Sinks.

@fzhinkin
Collaborator Author

Other questions, however, remain open:

  • How should closing stdin/err/out sources/sinks behave? Should it be an error or a no-op?
  • On Java, should these sinks/sources follow changes of corresponding System.in/out/err streams? For instance, if someone sets a new System.in stream, should a previously created source continue reading from the old stream, or should it switch to the newly set stream?
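For the first question, the "no-op" option could look roughly like the wrapper below. It is a sketch of one policy, not a decision: `NonClosingSink` is a hypothetical name, and `RecordingSink` exists only to observe the behaviour.

```kotlin
import kotlinx.io.Buffer
import kotlinx.io.RawSink

// Hypothetical wrapper: forwards writes and flushes, but never closes the delegate.
class NonClosingSink(private val delegate: RawSink) : RawSink {
    override fun write(source: Buffer, byteCount: Long) = delegate.write(source, byteCount)

    override fun flush() = delegate.flush()

    // "No-op" policy: push pending data through, leave the process-wide stream open.
    override fun close() = delegate.flush()
}

// Demonstration helper: remembers what it received and whether close() was called.
class RecordingSink : RawSink {
    val received = Buffer()
    var closed = false
        private set

    override fun write(source: Buffer, byteCount: Long) = received.write(source, byteCount)

    override fun flush() {}

    override fun close() {
        closed = true
    }
}
```

The "error" option would differ only in close(), for example by throwing IllegalStateException there; which of the two is less surprising inside a use { } block is exactly the open question.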

@JakeWharton
Contributor

On Java, should these sinks/sources follow changes of corresponding System.in/out/err streams?

I'm a pretty strong "no" on this.

The analogous behavior on the JVM would be calling System.out to obtain an OutputStream, and wanting that OutputStream instance to change where it writes to if someone calls System.setOut.
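The JVM analogy can be spelled out with plain java.io. In this sketch `captureThenRedirect` is a hypothetical helper; it installs two in-memory streams in turn so the experiment does not touch the real console, and restores the original stream afterwards.

```kotlin
import java.io.ByteArrayOutputStream
import java.io.PrintStream

// Hypothetical helper: writes through a reference captured before System.setOut
// and returns the contents of the first and the second installed stream.
fun captureThenRedirect(): Pair<String, String> {
    val original = System.out
    val first = ByteArrayOutputStream()
    val second = ByteArrayOutputStream()
    try {
        System.setOut(PrintStream(first, true))
        val captured = System.out                 // obtained while `first` is installed
        System.setOut(PrintStream(second, true))  // later redirection
        captured.print("hello")                   // the captured reference does not follow it
        captured.flush()
        return first.toString() to second.toString()
    } finally {
        System.setOut(original)                   // restore the real stdout
    }
}
```

The text ends up only in the first stream, which is the behaviour a sink created from System.out at some earlier point would naturally inherit.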

This problem isn't limited to the JVM. If you close the stdin FD on native, the next opened file will get the "standard" STDIN FD number.

@fzhinkin
Collaborator Author

This problem isn't limited to the JVM. If you close the stdin FD on native, the next opened file will get the "standard" STDIN FD number.

It seems to be the same as

follow changes of corresponding System.in/out/err streams

So if we don't want to follow such changes, a duplicated FD is needed on native. As a bonus, duplicating the FD will solve the issue of accidentally closing a stdin/out/err stream.
