Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix some Base.@propagate_inbounds annotations #155

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

cscherrer
Copy link

I noticed the compiler wasn't quite reducing things as it should, so here are some edits to (hopefully) improve performance. Thanks to @mbauman and @YingboMa for Slack discussion.

Suppose we have

julia> Z = Zeros(10)
10-element Zeros{Float64}

julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)

Then

Before

julia> @code_llvm f(Z,3)
;  @ REPL[5]:1 within `f`
define double @julia_f_2043([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
  %2 = alloca [1 x i64], align 8
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:37 within `_fill_getindex`
    %3 = getelementptr inbounds [1 x i64], [1 x i64]* %2, i64 0, i64 0
    store i64 %1, i64* %3, align 8
; ││┌ @ abstractarray.jl:659 within `checkbounds` @ abstractarray.jl:644
; │││┌ @ abstractarray.jl:718 within `checkindex`
; ││││┌ @ int.jl:471 within `<=`
       %4 = icmp slt i64 %1, 1
; ││││└
; ││││┌ @ range.jl:686 within `last`
; │││││┌ @ Base.jl:42 within `getproperty`
        %5 = getelementptr inbounds [1 x [1 x [1 x i64]]], [1 x [1 x [1 x i64]]]* %0, i64 0, i64 0, i64 0, i64 0
; ││││└└
; ││││┌ @ int.jl:471 within `<=`
       %6 = load i64, i64* %5, align 8
       %7 = icmp slt i64 %6, %1
; │││└└
; │││ @ abstractarray.jl:659 within `checkbounds`
     %8 = or i1 %4, %7
     br i1 %8, label %L12, label %L17

L12:                                              ; preds = %top
     %9 = call nonnull {}* @j_throw_boundserror_2045([1 x [1 x [1 x i64]]]* nocapture nonnull readonly %0, [1 x i64]* nocapture readonly %2) #0
     call void @llvm.trap()
     unreachable

L17:                                              ; preds = %top
; └└└
  ret double 0.000000e+00
}

After

julia> @code_llvm f(Z,3)
;  @ REPL[5]:1 within `f`
define double @julia_f_1017([1 x [1 x [1 x i64]]]* nocapture nonnull readonly align 8 dereferenceable(8) %0, i64 signext %1) #0 {
top:
  ret double 0.000000e+00
}

@dlfivefifty
Copy link
Member

Thanks! Any chance you can add tests?

@codecov
Copy link

codecov bot commented Aug 12, 2021

Codecov Report

Merging #155 (9878b6a) into master (269fd41) will not change coverage.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##           master     #155   +/-   ##
=======================================
  Coverage   99.21%   99.21%           
=======================================
  Files           4        4           
  Lines         633      633           
=======================================
  Hits          628      628           
  Misses          5        5           
Impacted Files Coverage Δ
src/FillArrays.jl 99.68% <100.00%> (ø)

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more

@cscherrer
Copy link
Author

cscherrer commented Aug 12, 2021

Only way I can think of offhand is to redirect stdout, capture it in a string and check its length. But I'm not sure how to do that. Any ideas?

I mean, there's

help?> redirect_stdout
search: redirect_stdout redirect_stdio redirect_stdin redirect_stderr

  redirect_stdout([stream]) -> stream

  Create a pipe to which all C and Julia level stdout output will be redirected. Return a stream representing the pipe ends. Data written to
  stdout may now be read from the rd end of the pipe.

  │ Note
  │
  │  stream must be a compatible objects, such as an IOStream, TTY, Pipe, socket, or devnull.

  See also redirect_stdio.

  ───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────

  redirect_stdout(f::Function, stream)

  Run the function f while redirecting stdout to stream. Upon completion, stdout is restored to its prior setting.

TTY looks like closest, but I have no idea how to create one

help?> Base.TTY
  No documentation found.

  Summary
  ≡≡≡≡≡≡≡≡≡

  mutable struct Base.TTY

  Fields
  ≡≡≡≡≡≡≡≡

  handle    :: Ptr{Nothing}
  status    :: Int64
  buffer    :: IOBuffer
  cond      :: Base.GenericCondition{Base.Threads.SpinLock}
  readerror :: Any
  sendbuf   :: Union{Nothing, IOBuffer}
  lock      :: ReentrantLock
  throttle  :: Int64

  Supertype Hierarchy
  ≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡≡

  Base.TTY <: Base.LibuvStream <: IO <: Any

@cscherrer
Copy link
Author

cscherrer commented Aug 12, 2021

@dlfivefifty Maybe something like this?

julia> using FillArrays, Test

julia> function llvm_lines(a)
           f(x,j) = @inbounds x[j]
           io = IOBuffer()
           code_llvm(io, f, (typeof(a), Int))
           # countlines(io)
           count(==('\n'), String(take!(io)))
       end
llvm_lines (generic function with 1 method)

julia> @test llvm_lines(Zeros(10)) < 10
Test Passed
  Expression: llvm_lines(Zeros(10)) < 10
   Evaluated: 5 < 10

julia> @test llvm_lines(Ones(10)) < 10
Test Passed
  Expression: llvm_lines(Ones(10)) < 10
   Evaluated: 5 < 10

julia> @test llvm_lines(Fill(2.0,10)) < 10
Test Failed at REPL[65]:1
  Expression: llvm_lines(Fill(2.0, 10)) < 10
   Evaluated: 12 < 10
ERROR: There was an error during testing

Need to check on that last one again.

EDIT: Mostly it's just comments, looks ok

julia> f(z,j) = @inbounds z[j]
f (generic function with 1 method)

julia> @code_llvm f(Fill(2.0, 10), 3)
;  @ REPL[66]:1 within `f`
define double @julia_f_1329({ double, [1 x [1 x i64]] }* nocapture nonnull readonly align 8 dereferenceable(16) %0, i64 signext %1) #0 {
top:
; ┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:42 within `getindex`
; │┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:38 within `_fill_getindex`
; ││┌ @ /home/chad/git/FillArrays.jl/src/FillArrays.jl:127 within `getindex_value`
; │││┌ @ Base.jl:42 within `getproperty`
      %2 = getelementptr inbounds { double, [1 x [1 x i64]] }, { double, [1 x [1 x i64]] }* %0, i64 0, i32 0
; └└└└
  %3 = load double, double* %2, align 8
  ret double %3
}

@cscherrer
Copy link
Author

Oh and I can make debuginfo=:none, then we get 6 lines at most.

@cscherrer
Copy link
Author

Hmm, this works interactively but fails in the test environment with

Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1337
  Expression: llvm_lines(Zeros(10)) < 10
   Evaluated: 20 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1337 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1338
  Expression: llvm_lines(Ones(10)) < 10
   Evaluated: 20 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1338 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Inbounds optimization: Test Failed at /home/chad/git/FillArrays.jl/test/runtests.jl:1339
  Expression: llvm_lines(Fill(2.0, 10)) < 10
   Evaluated: 22 < 10
Stacktrace:
 [1] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:445 [inlined]
 [2] macro expansion
   @ ~/git/FillArrays.jl/test/runtests.jl:1339 [inlined]
 [3] macro expansion
   @ /buildworker/worker/package_linux64/build/usr/share/julia/stdlib/v1.7/Test/src/Test.jl:1282 [inlined]
 [4] top-level scope
   @ ~/git/FillArrays.jl/test/runtests.jl:1329
Test Summary:         | Fail  Total
Inbounds optimization |    3      3
ERROR: LoadError: Some tests did not pass: 0 passed, 3 failed, 0 errored, 0 broken.
in expression starting at /home/chad/git/FillArrays.jl/test/runtests.jl:1328
ERROR: Package FillArrays errored during testing

I guess ]test must use a lower optimization setting? Not sure how we can test in that case.

@YingboMa
Copy link

You'd need PerformanceTestTools.jl to actually disable bounds checks in test.

@dlfivefifty
Copy link
Member

Maybe it's not worth the hassle and we just merge?

@cscherrer
Copy link
Author

Let me take one more pass at it. If this doesn't work I'll roll it back to just the code updates

@cscherrer
Copy link
Author

And thanks @YingboMa , I hadn't heard of that one :)

@cscherrer
Copy link
Author

Nope, no luck. I left the code commented out in case we can come back to it

@YingboMa
Copy link

That's not how you use PerformanceTestTools.jl. See https://github.com/YingboMa/FastBroadcast.jl/blob/master/test/runtests.jl#L84

test/runtests.jl Outdated Show resolved Hide resolved
Co-authored-by: Matt Bauman <[email protected]>
Comment on lines +36 to 39
Base.@propagate_inbounds function _fill_getindex(F::AbstractFill, kj::Integer...)
@boundscheck checkbounds(F, kj...)
getindex_value(F)
end
Copy link
Contributor

@mcabbott mcabbott Aug 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this just be @inbounds? I thought that a function which deals with checkbounds need not propagate anything further, and only functions like getindex(F::AbstractFill, k::Integer) = _fill_getindex(F, k) need to do so. Like the example here:

https://docs.julialang.org/en/v1/devdocs/boundscheck/#Eliding-bounds-checks

(But 5 mins of trying to @eval things to test didn't tell me anything, so I am not 100% sure.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants