Cross-file goto references #120

Open
dannypsnl opened this issue Jul 29, 2023 · 9 comments

@dannypsnl
Contributor

As I recall, the current implementation can go to references within the same file, but usages in other files are not listed.

@6cdh
Contributor

6cdh commented Jul 29, 2023

A reimagined version looks like this:

#lang racket

;; 3 symbol levels

;; 1. workspace level
;;    it's defined in a module and used by other modules,
;;    identified by (list module name) or (list module location)
;; 2. file level
;;    it's a top level symbol in a module,
;;    identified by (list module name) or (list module location)
;; 3. local level
;;    it's a local symbol, identified by (list module location)

;; module -> location -> (listof location)
(define (goto-reference mod pos)
  (define symbol-info (get-symbol-info-at pos))
  (if (imported? symbol-info)
      (get-workspace-refs (source-mod symbol-info) (source-name symbol-info))
      (get-local-refs mod pos)))

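;; module -> symbol -> (listof location)
;; collect reference locations across files
(define (get-workspace-refs mod name)
  ;; get-file-refs already returns a list of locations,
  ;; so iterate over it directly instead of nesting one level deeper
  (for*/list ([user-mod (get-all-used-mods mod name)]
              [ref (get-file-refs user-mod name)])
    ref))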

;; TODO: module -> location -> (listof location)
(define (get-local-refs mod pos) #f)

;; TODO: module -> symbol -> (listof module)
(define (get-all-used-mods mod name) #f)

;; TODO: info -> module
(define (source-mod info) #f)

;; TODO: info -> symbol
(define (source-name info) #f)

;; TODO: module -> symbol -> (listof location)
(define (get-file-refs mod name) #f)

;; TODO: position -> info
(define (get-symbol-info-at pos) #f)

;; TODO: info -> boolean
(define (imported? info) #f)

@dannypsnl
Contributor Author

dannypsnl commented Jul 29, 2023

I think maintaining a dependency graph will help, since changes to the graph are usually small (adding or deleting a file).

The case where it helps: we only need to look for usages in the modules that use the current file.
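A minimal sketch of such a graph, assuming modules are identified by plain strings and that the list of required modules for each file is already known. The names `make-dep-graph`, `add-file!`, `remove-file!`, and `users-of` are hypothetical and not part of racket-langserver.

```racket
#lang racket/base
(require racket/set)

;; Reverse dependency graph: module -> set of modules that require it.
;; All names here are illustrative, not an existing API.
(define (make-dep-graph) (make-hash))

;; Record that `file` requires every module in `requires`.
(define (add-file! graph file requires)
  (for ([dep (in-list requires)])
    (hash-update! graph dep (lambda (users) (set-add users file)) (set))))

;; Forget `file` as a user of every module it used to require.
(define (remove-file! graph file)
  (for ([dep (in-list (hash-keys graph))])
    (hash-update! graph dep (lambda (users) (set-remove users file)))))

;; Modules that directly require `file`.
(define (users-of graph file)
  (hash-ref graph file (set)))

(define g (make-dep-graph))
(add-file! g "a.rkt" '("lib.rkt"))
(add-file! g "b.rkt" '("lib.rkt" "a.rkt"))
(define lib-users (users-of g "lib.rkt"))
```

Here `lib-users` holds the two files that a cross-file reference search for a symbol from `lib.rkt` would have to visit. This only tracks direct requires; re-exports would need a transitive walk over the same table.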

@6cdh
Contributor

6cdh commented Jul 29, 2023

I think so

@dannypsnl
Contributor Author

dannypsnl commented Aug 29, 2023

An overly simple test comparing concurrency and parallelism for file checking. Any test-dir can be used; I used racket-langserver itself as the target.

Update 2024/9/6 (v8.14 [cs]): I realized my old program had a big bug.

  • Sequential: ~14.053sec
  • Concurrency: ~15.803sec (racket thread)
  • Parallelism: ~12.389sec (racket place)

Main file

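#lang racket
(require drracket/check-syntax)
(require racket/place/dynamic)

;; test-dir is not defined in the original snippet; any sequence of paths works.
;; Placeholder (assumption): every file under the current directory.
(define test-dir (in-directory "."))

;; Sequential: check each .rkt file one after another.
(time
 (for ([x test-dir])
   (when (path-has-extension? x #".rkt")
     (show-content x))))

;; Concurrency: one Racket thread per file.
(time
 (define ts
   (for/list ([x test-dir])
     (thread
      (lambda ()
        (when (path-has-extension? x #".rkt")
          (show-content x))))))
 (for ([t ts])
   (thread-wait t)))

;; Parallelism: N places, files handed out round-robin.
(define N 8)
(time
 (let ([pls (for/list ([_ (in-range N)])
              (dynamic-place "worker.rkt" 'place-main))])
   (for ([x (sequence->list test-dir)]
         [i (in-naturals)]
         #:when (path-has-extension? x #".rkt"))
     (define idx (modulo i N))
     (define p (list-ref pls idx))
     (place-channel-put p x))
   ;; Tell every worker that no more files are coming.
   (for ([p pls])
     (place-channel-put p 'done))
   (map place-wait pls)))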

worker.rkt

#lang racket
(provide place-main)
(require drracket/check-syntax)

;; Check each file received on the place channel until 'done arrives.
(define (place-main pch)
  (let loop ([tocheck (place-channel-get pch)])
    (match tocheck
      ['done (void)]
      [tocheck
       (show-content tocheck)
       (loop (place-channel-get pch))]))
  (place-channel-put pch 'done))

@dannypsnl
Contributor Author

Unfortunately, places may not actually be faster, although they might improve responsiveness.

@6cdh
Contributor

6cdh commented Sep 6, 2024

The result is similar on my machine (Windows 11, WSL 2):

sequential
cpu time: 14855 real time: 15051 gc time: 4653
thread
cpu time: 20192 real time: 20195 gc time: 9199
place
cpu time: 47059 real time: 13617 gc time: 21499

I think we can fork the check syntax library and make some modifications. That would solve some performance issues and give us finer control.

But this might require rewriting quite a lot of code. Another big obstacle to implementing these features is that we don't have a proper abstraction layer. Every new feature has to be reimplemented on top of the interfaces of the check syntax library.

@dannypsnl
Contributor Author

Another idea I have is to check the #lang line and split files into two groups:

  1. the Racket series: #lang racket/base, #lang racket, and so on
  2. others

For the Racket series, we can implement a tree-sitter-based tracker to find which symbols each file provides and who is using them. There are two major problems:

  1. jumping to keywords will not work in this implementation
  2. Racket's provide can be wild: anyone can implement their own provide sub-forms, and those won't be caught

Even in this case, an abstraction layer is still needed. I'm just recording this as a possibility; I think modifying the check syntax library will be the better long-term solution.

@dannypsnl
Contributor Author

@6cdh
Contributor

6cdh commented Sep 7, 2024

I don't think tree-sitter fits our case, because people can modify the readtable anywhere, or require another module that modifies the readtable at run time. Tree-sitter can't handle dynamic behavior.

That limits us to the built-in read functions.

Even worse, macros can appear anywhere. For example, struct and class are macros, and they are very common. These macros create new bindings such as struct_name-field_name and set-struct_name-field_name!, which can be provided through (struct-out struct_name). If we only look at the AST, we can't find where they came from. We need to expand macros first.

That limits us to the built-in expand functions.
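To illustrate the struct point, here is a small sketch that declares a throwaway module in a fresh namespace and asks Racket which names it exports. The module name `shapes`, the struct `point`, and the helper `export-names` are made up for this example; `module->exports` is the standard reflection function.

```racket
#lang racket/base

;; A fresh namespace so the throwaway module does not leak anywhere.
(define ns (make-base-namespace))

;; Declare a module whose only export is (struct-out point),
;; then ask for its exports after macro expansion.
(define-values (exported-variables exported-syntaxes)
  (parameterize ([current-namespace ns])
    (eval '(module shapes racket/base
             (provide (struct-out point))
             (struct point (x y) #:mutable)))
    (module->exports ''shapes)))

;; Each export list is grouped by phase; collect just the names.
(define (export-names exports)
  (for*/list ([phase-group (in-list exports)]
              [entry (in-list (cdr phase-group))])
    (car entry)))

(define variable-names (export-names exported-variables))
```

None of the accessor or mutator names in `variable-names` appear literally in the source text of the module, which is why a purely syntactic scan of the file would miss them.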

If we go with the AST approach, the read functions are more suitable than tree-sitter: they are also very fast and can handle readtables, while tree-sitter means a C dependency. The only advantage of tree-sitter here is its error tolerance. But in a multi-file project, if someone is working on one file, we can generally assume the other files are correct, or at least readable by the reader.
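For reference, a rough sketch of reading a module with the built-in reader, so that the `#lang` line (and any readtable it installs) is respected. The helper name `read-module-syntax` and the sample source string are assumptions made for this example.

```racket
#lang racket/base

;; Read one module's source as a syntax object using the built-in reader.
;; read-accept-reader must be enabled for the #lang line to be accepted.
(define (read-module-syntax in)
  (port-count-lines! in)
  (parameterize ([read-accept-reader #t])
    (read-syntax (object-name in) in)))

;; Hypothetical source text standing in for a file on disk.
(define example-source
  "#lang racket/base\n(require racket/list)\n(define x (first '(1 2)))\n")

(define stx (read-module-syntax (open-input-string example-source)))
(define datum (syntax->datum stx))
```

The result is a `module` form with source locations attached, which can then be passed to `expand` when macro-generated bindings matter.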

I agree with the Racket series part. We need to handle Racket specially.

I hope forking and modifying the check syntax library is not too difficult. I saw it has 2700 lines of code. Although I don't have time to do it this year :)
