Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Chady Kamar #9

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Chady Kamar #9

wants to merge 6 commits into from

Conversation

chadykamar
Copy link

This is my solution to the challenge. It does have some issues, mainly with the function calls:

  • It considers all the files as source instead of ignoring .txt and .rst for instance.
  • It doesn't ignore keywords such as if, else, and so on so they can be tagged a functions.
  • More generally it wouldn't work for a lot of programming languages (non C-like or functional languages), or even for anonymous funcs/closures/C++ macros.
  • The regexp package seems to be quite slow in Go so it may be worthwhile to use something else to parse for function calls. The same program in Rust ran about 2x as fast mostly due to that.

Copy link

@lbajolet lbajolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While the code is relatively clean, I feel that the goroutines are too small in granularity. I think the work done on each of them is too small to justify their creation most of the time, and that most of the work done here could be done sequentially.

Also, the commit messages are not descriptive of the work being done in the commit itself.

main.go Outdated
}
for _, match := range matches {
// We'll keep only the function name i.e. remove the parentheses and params
functionCall := match[:len(match)-1]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Without any subgroup, the len(match) - 1 could become 0, unless I'm missing something

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The matches are all at least 2 characters long: the function's name followed by the opening bracket. This is only meant to remove that parenthesis. I'll replace it with https://golang.org/pkg/strings/#TrimSuffix for clarity. I also noticed an error in my regular expression which disallowed function names with a single character.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah my bad, I didn't realize we were working on a match, not on the matches array, my bad

main.go Outdated
go func() {
defer wwg.Done()
for range regionsChan {
regions++

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel this is a bit strange optimization, starting a goroutine to increment a variable seems counterproductive as the cost of switching context between the different tasks will likely be higher than doing the operation locally (assuming you have more goroutines than cores).

Also, the channels you use are unbuffered, this means that goroutines who want to write concurrently on the channel will have to wait for the input of another to be consumed by this routine before it can continue executing.

All in all, while parallellizing the computations on different files at once is likely a cause for improvement in speed, I feel that the other routines are wasted resources as they likely do too little to bring real improvements in performance.

I may however be wrong, if you are able to provide benchmarks to justify these small routines, I'd be happy to take a look at them

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do get better results this way. Just to clarify, are you referring to the 5 goroutines which do the incrementing/appending or the fact that I spawn a goroutine to process every line? My basic reasoning for spawning a routine for every line was that appending to a slice and inserting into a map can be expensive so I would rather not wait for the result to be computed to continue processing the incoming lines. The 5 routines and their respective channels are just an (ever so slightly faster) alternative to mutexes in this case.

@chadykamar
Copy link
Author

I've benchmarked a few different methods. You were correct in that the goroutines are too small in granularity. Instead of spawning one per line, it's better to spawn one per available CPU.

@chadykamar
Copy link
Author

chadykamar commented Nov 16, 2018

Benchmark results on quad-core 2 GHz core i7:

BenchmarkNoConcurrency-8                            	      20	  68126884 ns/op
BenchmarkConcurrencyReadingOnly-8                   	      20	  76652139 ns/op
BenchmarkConcurrencyChannelsOneGoroutinePerLine-8   	      30	  49373400 ns/op
BenchmarkConcurrencyMutexesOneGoroutinePerLine-8    	      30	  50116187 ns/op
BenchmarkConcurrencyMutexesOneGoroutinePerCPU-8     	      30	  47364315 ns/op
BenchmarkConcurrencyChannelsOneGoroutinePerCPU-8    	      30	  46514586 ns/op

You can run them with go test -bench=.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants