Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Assignment clever-challenge. #8

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

CamilleEscher
Copy link

No description provided.

@MathieuNls
Copy link
Owner

Hi,
Could you comment a bit on the approach you took ? See #6 or #4 for reference.
M.

Copy link

@lbajolet lbajolet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not see any code detecting function calls or maybe I have misunderstood some part of the code, could you confirm if this is missing?
Also, please be consistent with indentation, either choose tabs or spaces, but keep to it.

main.go Outdated
}
res := result{}
for _, f := range files {
file_path := path.Join(diff_dir_path, f.Name())

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Watch out with the path package as it is not intended for file system path manipulation, filepath is more suited for this

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Noted.

main.go Outdated
task_list := []string{"DIFF", "REGION", "UPDATE"}
for scanner.Scan() {
if scanner.Text() != "" {
if len(scanner.Text()) > 0 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I speak for myself, and possibly others will agree that this can be &ed for more linear code, this'll help readability

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree.

main.go Outdated
if len(scanner.Text()) > 0 {
for _, task := range task_list {
success, on_success := execute_task(&task, scanner.Text(), res)
if success && len(on_success) == 0 {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this can be simplified by the negative; i.e. if !success { continue }

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it works

main.go Outdated
if success && len(on_success) == 0 {
continue
} else if success {
task_list = on_success

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I understand what you are doing, is it some state machine?
It may require a bit of commenting since it was rather confusing for me at first.

Copy link
Author

@CamilleEscher CamilleEscher Sep 28, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it is kind of a state machine where I have mainly tree states, the states determine which pattern to search in priority, like a very simple waterfall where we test the most complex and intricated task when no other task succeded, the tasks being finding clue about state transition, states 'being in a diff headline' (DIFF), 'in a region headline' (REGION) 'in diff content' (UPDATE). When we are in state UPDATE, we have to check all states with priority DIFF then REGION then UPDATE and FUNCTION CALL.

main.go Outdated

func execute_task(task *string, line string, res *result) (bool, []string) {
success := false
on_success := make([]string, 0)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a small detail, but this should be declared as nil first

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

}
}
}
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a fallback for missing future cases may be worth it

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

main.go Outdated
func execute_task(task *string, line string, res *result) (bool, []string) {
success := false
on_success := make([]string, 0)
if *task == "DIFF" {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're going towards a state-machine approach, I'd suggest using enums for that kind of manipulation, it'll be way lighter than strings for states.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you, I just found out how to write enums.

main.go Outdated
// Extract $file_path_a and $file_path_b if line = 'diff --git a/$file_path_a b/$file_path_b'
func is_new_diff(line *string, res *result) bool {
new_diff := false
if line != nil {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is personal preference (and is something advocated by several open-source projects), but I think you should linearize your code for easier understanding of what's happening.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright, I see that I refined a lot of conditions and indeed there refacto to do.

main.go Outdated
new_diff := false
if line != nil {
if len(*line) > 10 {
if (*line)[:11] == "diff --git " {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both those lines could be replaced by a strings.HasPrefix

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It increased a bit performances.

@CamilleEscher
Copy link
Author

CamilleEscher commented Sep 28, 2018

Hi,
Could you comment a bit on the approach you took ? See #6 or #4 for reference.
M.
Hi Mathieu,
I am currently reviewing my code, I understand the code itself would benefit from some comments, and as I am currently reading and correcting the code accordingly.
As go is new for me I am learning a lot with this project, and I was looking for a quick method, that would solve a significant proportion of cases, focusing on the source files in C as I noticed it represented the majority of the problem.
My current method read the diff files lineby line and target a simple case of function calls using regexp searching a pattern preceding a pair of closed parenthesis for function calls, which is really basic. Prototypes of C functions will also match this pattern, which is a limit of the model.
Reading line by line with a scanner is not really efficient as the all process takes around 3 to 4 seconds to complete.

I am currently working on a new version where I read the file character by character and try to apply a combination of expectation maximization on Hidden Markov Models to learn the statistics of an HMM which state would be
UNINITIALIZED, IN_DIFF_HEADLINE, IN_DIFF_FILE_NAME, IN_DIFF_HEADER, IN_REGION_HEADLINE, IN_DIFF_CONTENT, IN_FUNCTION_CALL
Then use the viterbi algorithm to infer the sequence of hidden state under each character of the diff file, and infer the position of the data of interest.

Thank you for this, and thank you Lucas for your comments, I learned from them and am working on it.

Best regards
Camille

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants