Clever Challenge - Rust #10

Open: wants to merge 18 commits into base: master
2 changes: 2 additions & 0 deletions .gitignore
@@ -0,0 +1,2 @@
/target
**/*.rs.bk
96 changes: 96 additions & 0 deletions Cargo.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

8 changes: 8 additions & 0 deletions Cargo.toml
@@ -0,0 +1,8 @@
[package]
name = "clever-challenge"
version = "0.1.0"
authors = ["Tormyst <[email protected]>"]

[dependencies]
regex = "1.1.0"
lazy_static = "1.2.0"
9 changes: 0 additions & 9 deletions Gopkg.lock

This file was deleted.

22 changes: 0 additions & 22 deletions Gopkg.toml

This file was deleted.

80 changes: 31 additions & 49 deletions README.md
@@ -1,27 +1,6 @@
# Clever Initiative Challenge
# Clever Initiative Challenge Implementation

The Clever-Initiative is a team of the Technology Group @ Ubisoft Montreal. Our goal is to discover, improve and promote state-of-the-art software engineering practices that aim to ease the production of quality. By augmenting the quality of our products, we mechanically improve the productivity of our teams as they can focus creating features rather than fixing bugs.

The Clever-Initiative is the team behind the [Submit-Assitant](https://montreal.ubisoft.com/en/ubisoft-la-forge-presents-the-commit-assistant/) (a.k.a Clever-Commit) that received some press coverage recently: [Wired](http://www.wired.co.uk/article/ubisoft-commit-assist-ai), [MIT](https://www.technologyreview.com/the-download/610416/ai-can-help-spot-coding-mistakes-before-they-happen/), and [more](https://www.google.ca/search?q=commit+assistant+ubisoft). The scientific foundation behind our work have been accepted for publication to [MSR'18](https://montreal.ubisoft.com/en/clever-combining-code-metrics-with-clone-detection-for-just-in-time-fault-prevention-and-resolution-in-large-industrial-projects-2/), [CPPCON'18](https://www.youtube.com/watch?v=QDvic0QNtOY).

We are currently looking for trainees to join us (Winter'19). The length and start date of the internship will be discussed on a per applicant basis.

## Trainees

A trainee applicant must:

- Be engaged in a computer science (or related) university program.
- Be able to work in Canada legally.
- Be willing to come to Montreal.
- Be able to read, understand and implement scientific papers.
- Know:
- versionning systems (git, perforce, ...)
- c/c++/csharp or java
- Know or be willing to learn:
- golang
- docker
- sql
- angular
This repo contains an implementation of the Clever Initiative Challenge written in Rust. The focus is on extensibility, maintainability and efficiency while solving the problem.

## The Challenge

@@ -35,38 +14,41 @@ The challenge for trainee applicant consists in parsing a few diffs--in the most

All these stats are to be computed globally (i.e. for all the diffs combined).

In the main.go file; you'll find the `compute` method that needs to be implemented.
## Permanent Positions

```golang
//compute parses the git diffs in ./diffs and returns
//a result struct that contains all the relevant information
//about these diffs
// list of files in the diffs
// number of regions
// number of line added
// number of line deleted
// list of function calls seen in the diffs and their number of calls
func compute() *result {
I am applying for a permanent position. I understand that this challenge is meant for less permanent positions. However, as I was sent here directly, I am completing it anyway.

return nil
}
```
## Why Rust?

To enter the challenge:
I wish I could do this in Go, but I don't believe in presenting work in a programming language while I am still learning it. As I am new to Go, the result would not be my best work.

- Fork this repository
- Implement your solution
- Open a pull request with your solution. In the body of the pull request, you can explain the choices you made if necessary.
While I am new to Rust, my major Rust side project [FeGaBo](github.com/tormyst/FeGaBo) has given me enough experience with the language.

You can alter the data structure, add files, remove files ... you can even start from scratch in another language if you feel like it.
However, note that we do use golang internally.
Using Rust has interesting benefits. While efficient solutions are possible in other languages, Rust pushes a lot of the work onto the compiler. Ensuring that a task can be multithreaded is simple: if the operation is not memory safe, safe Rust code will not compile. Casts and mutability are explicit, which helps ensure that only what needs to be done is done.

If you don't feel comfortable sharing your pull-request with the world (pull-request are public) you can invite me (@mathieunls for github, bitbucket and gitlab) and Florent Jousset (@heeko for github, bitbucket and gitlab) to a private repo of yours. Don't bother to send code by email, we won't read it.
While this solution does not use threads, they could be added relatively easily so that each file is processed individually.
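A minimal sketch of what per-file threading could look like, using only the standard library. The `Stats` type and the `stats_for` and `stats_parallel` functions are hypothetical names made up for this illustration; they are not part of this repo, and the line counting is deliberately simplified.

```rust
use std::thread;

/// Hypothetical per-file statistics (an assumption for this sketch,
/// not the `Result` type used in this repo).
#[derive(Debug, Default, PartialEq)]
struct Stats {
    added: usize,
    deleted: usize,
}

/// Counts added and deleted lines in one diff.
/// Simplification: any line starting with `+++` or `---` is treated as a
/// file header, so a changed line whose content starts with `++` or `--`
/// would be missed.
fn stats_for(diff: &str) -> Stats {
    let mut stats = Stats::default();
    for line in diff.lines() {
        if line.starts_with("+++") || line.starts_with("---") {
            continue;
        }
        if line.starts_with('+') {
            stats.added += 1;
        } else if line.starts_with('-') {
            stats.deleted += 1;
        }
    }
    stats
}

/// Processes each diff on its own thread, then merges the partial results.
fn stats_parallel(diffs: Vec<String>) -> Stats {
    // Each closure takes ownership of its diff, so no data is shared
    // between threads and no locking is needed.
    let handles: Vec<_> = diffs
        .into_iter()
        .map(|diff| thread::spawn(move || stats_for(&diff)))
        .collect();

    let mut total = Stats::default();
    for handle in handles {
        let part = handle.join().expect("worker thread panicked");
        total.added += part.added;
        total.deleted += part.deleted;
    }
    total
}

fn main() {
    let diffs = vec![
        "--- a/x\n+++ b/x\n@@ -1 +1,2 @@\n+foo()\n-bar()\n".to_string(),
        "--- a/y\n+++ b/y\n@@ -1 +1 @@\n+baz()\n".to_string(),
    ];
    println!("{:?}", stats_parallel(diffs));
}
```

Because every thread owns its input and returns its own partial result, merging happens only on the calling thread. Spawning one thread per file is the simplest arrangement; for many small files a fixed-size pool would likely be a better fit.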

## Permanent Positions
## Building this project

This is a `cargo` project tested on stable Rust.

Once the toolchain is installed, which is best done through rustup (available through most package managers), use `cargo run` to run the solution.

For speed, try compiling under release: `cargo run --release`

We are also looking for permanent members to join our team. If you are interested, mail our human resource [contact](mailto:[email protected]?subject=Clever-Initiative) with your resume. You can submit your pull request for the challenge. However, you'll be subjected to an in-depth (much harder) coding test. This one has been conceived for students only and it might not be worth your time to take it ;).
## Anything else?

- [ ] Software Engineer / ML Dev (python, go, ml, sql, ...)
- [ ] Software Engineer / Backend Dev (go, c, cfg, ast, k8s, redis, ...)
- [ ] Software Engineer / Tool devs (csharp, python, cfg, ast, ...)
A run through the trainee applicant requirements:

- Be engaged in a computer science (or related) university program. (Completed, with a master's)
- Be able to work in Canada legally. (Yes)
- Be willing to come to Montreal. (Already here)
- Be able to read, understand and implement scientific papers. (Did that during my master's)
- Know:
  - versioning systems (git, perforce, ...) (I know both, as well as svn and something called accurev)
  - c/c++/csharp or java (c, c++, csharp and java)
- Know or be willing to learn:
  - golang (It's next on my list, I just wanted to get through this first)
  - docker (I know a bit about how it works and have used it several times, but not in depth)
  - sql (A few variants)
  - angular (I am no designer and no expert, but I have made some fun things in angular)
37 changes: 0 additions & 37 deletions main.go

This file was deleted.

50 changes: 0 additions & 50 deletions result.go

This file was deleted.

49 changes: 49 additions & 0 deletions src/diff/diff_parse.rs
@@ -0,0 +1,49 @@
use result::Result;
use regex::Regex;

/// Returns the two filenames present in the diff header.
/// We assume `diff --git` headers, and this function enforces that: any other input panics.
/// This is important as otherwise we would be storing each filename twice: once for the a path
/// and once for the b path.
///
/// This implementation assumes a path is any sequence of characters that contains no space.
pub fn header_filenames(header: &str) -> (String, String) {
    lazy_static! {
        static ref HEADER_GIT_FILENAME: Regex =
            Regex::new(r"^diff --git a/([^ ]+) b/([^ ]+)$").unwrap();
    }
    let groups = HEADER_GIT_FILENAME.captures(header).unwrap();
    (
        groups.get(1).unwrap().as_str().to_string(),
        groups.get(2).unwrap().as_str().to_string(),
    )
}

/// Finds all function calls in a string and adds them to the given result structure.
/// This is the slowest part of the diff stats process, as we have to iterate through a UTF-8
/// string doing constant comparisons.
/// For the purposes of this challenge, a function call is defined by the regex below: one or
/// more word characters followed immediately by an open bracket. The issue with this definition
/// is that it is unlikely to be correct. Languages like C allow any amount of whitespace
/// between the name of the function and the open bracket. Likewise, the string 'for(' should
/// not be a function identifier, but this regex matches it.
///
/// All in all, this is the best that can be done in a short time, as the alternative would be to
/// actually understand whatever language we are sifting through and then find functions as
/// defined by the language itself.
///
/// # Why are we passing in a result structure?
///
/// This function has the disadvantage of producing a variable-size response: a call to
/// find_functions may find 0 matches or 5.
/// Returning a Result structure: initializing, say, 10_000 result structures, each with an
/// empty set for filenames, is fairly wasteful.
/// Returning a HashSet: we could return just the part of the result we need, and either create
/// a function to merge it in or expose the underlying structure of the result.
/// I went with passing the result structure in directly. This is something I feel needs to be
/// fixed, but the other options seem suboptimal.
pub fn find_functions(string: &str, result: &mut Result) {
    lazy_static! {
        static ref FUNCTION_REGEX: Regex = Regex::new(r"\w+\(").unwrap();
    }
    FUNCTION_REGEX
        .find_iter(string)
        .map(|function| format!("{})", function.as_str()))
        .for_each(|string| result.add_function_call(string))
}
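
The doc comment above weighs passing the result structure in against returning a collection, and notes that `\w+\(` matches keywords such as `for(` and misses calls written with whitespace before the bracket. The following is a rough sketch of one alternative, not the approach taken in this PR: a hand-written scan (swapped in for the `regex` crate so the example depends only on the standard library) that returns a map of counts for the caller to merge. The name `count_calls` and the `KEYWORDS` list are assumptions made for illustration, and the keyword list is far from complete.

```rust
use std::collections::HashMap;

/// Keywords that look like calls when followed by `(`.
/// Illustrative and incomplete; a real list depends on the language.
const KEYWORDS: [&str; 5] = ["if", "for", "while", "switch", "return"];

/// Returns how often each `name(` pattern occurs in `text`, keyed as `name()`.
/// Unlike the `\w+\(` regex, this allows whitespace between the name and the
/// open bracket and skips the keywords listed above.
fn count_calls(text: &str) -> HashMap<String, usize> {
    let mut counts = HashMap::new();
    let mut word = String::new();
    // True when whitespace has been seen since the current word ended.
    let mut gap = false;

    for c in text.chars() {
        if c.is_alphanumeric() || c == '_' {
            // A word character after a gap starts a new word.
            if gap {
                word.clear();
                gap = false;
            }
            word.push(c);
        } else if c.is_whitespace() {
            gap = !word.is_empty();
        } else {
            if c == '(' && !word.is_empty() && !KEYWORDS.contains(&word.as_str()) {
                *counts.entry(format!("{}()", word)).or_insert(0) += 1;
            }
            word.clear();
            gap = false;
        }
    }
    counts
}

fn main() {
    let line = "for (i = 0; i < n; i++) { total += add(i, 1); print (total); }";
    for (name, count) in count_calls(line) {
        println!("{} {}", name, count);
    }
}
```

Returning a map keeps the function free of any dependency on the result structure, at the cost of one allocation per call and a merge step in the caller. Like the regex, it still knows nothing about strings, comments or declarations, so it remains an approximation.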