Oryp7 unhealthy jumpy thermals during gaming #224
Comments
Have you tested with self-built firmware from this repository, or are you on the currently published firmware? There have already been improvements made to cooling behavior since the current firmware was published, including fan speed ramp-up/ramp-down (with a corresponding decrease in reaction time) and syncing the CPU and GPU fans together since the heatsinks are connected. If you don't want to build and flash firmware yourself, these improvements will be part of an upcoming regular firmware update that is currently being tested. |
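One way to picture the ramp-up/ramp-down behavior described above is a limit on how far the fan duty may move toward its target on each control tick. The sketch below only illustrates that assumption. The function name `ramp` and the step sizes are hypothetical and are not taken from the EC firmware.

```python
def ramp(current: float, target: float,
         max_up: float = 10.0, max_down: float = 5.0) -> float:
    """Move the fan duty (in percent) toward `target` by at most one step.

    `max_up` and `max_down` are assumed per-tick limits, chosen only for
    illustration.
    """
    if target > current:
        # Ramping up: never overshoot the target.
        return min(target, current + max_up)
    # Ramping down (or already at the target): never undershoot it.
    return max(target, current - max_down)


# A sudden jump in target duty is spread over several ticks.
duty = 30.0
history = []
for _ in range(4):
    duty = ramp(duty, 100.0)
    history.append(duty)
```

With limits like these, a short temperature spike produces only a small change in fan speed, while a sustained load still reaches full speed after a few ticks.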
I have not flashed the firmware myself as I'm waiting for an official update, so I'm running the current "stable" version. Yes, I'm well aware that you guys smoothed the cooling curves, but I'm not sure if it will fix this problem. Here's my reasoning:
|
This is why the reaction time was reduced, as I mentioned. The delay you currently see was to prevent short bursts of high fan speed. That's no longer as much of an issue with smoothing, so the fans will respond slightly quicker. A 10-second "running maximum" would imply a 10-second delay in response for decreases in temperature. Using an average instead of the current temperature would also add more delay, as it would take longer for the average to rise/fall than the actual temperature. The smoothing + slight remaining delay created something that felt like an average last time I tested it, so I'd recommend you wait and try it out. The upcoming firmware update is for all Open Firmware laptops and will still be in testing for a little while, so if you want to give it a try now, you can install Rust using the command on this website and then build/flash on your Oryx Pro using these commands:
The flashing script will power off your machine, so save any work you have open before running it. As long as you remain plugged into the charger through this entire process, it should be fairly low-risk. Once you're on the self-built firmware, your Oryx Pro will prompt you for a firmware "update," which you can install at any time through the GUI to go back to the regular published firmware. If you find that the fan behavior is still not satisfactory, this issue should probably be transferred to the EC repository, since that is where most of the work on fan behavior happens. There's also some discussion about fan curves here: system76/ec#180 |
Maybe I didn't explain it clearly, but a 10-second running maximum keeps track of the maximum temperature encountered during the last 10 seconds. So if you suddenly get a spike in temperature, the running maximum immediately takes on that value and holds it for the next 10 seconds unless another spike occurs. I think I will give the master version of the firmware a try after all; thanks for the guide. |
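The running maximum described in this comment can be sketched as a sliding-window maximum. This is a minimal illustration, not firmware code: the class name `RunningMax`, the one-sample-per-second rate, and the temperatures are assumptions made for the example.

```python
from collections import deque


class RunningMax:
    """Maximum of the samples seen during the last `window_s` seconds."""

    def __init__(self, window_s: float = 10.0) -> None:
        self.window_s = window_s
        # (timestamp, temperature) pairs, temperatures strictly decreasing.
        self._candidates = deque()

    def update(self, now_s: float, temp_c: float) -> float:
        # Older samples that are not hotter than the new one can never
        # be the maximum again, so drop them.
        while self._candidates and self._candidates[-1][1] <= temp_c:
            self._candidates.pop()
        self._candidates.append((now_s, temp_c))
        # Expire samples that have left the window. The sample just added
        # is always inside the window, so the deque never becomes empty.
        while self._candidates[0][0] <= now_s - self.window_s:
            self._candidates.popleft()
        return self._candidates[0][1]


# One sample per second: a single spike to 95 C at t = 1 s.
tracker = RunningMax(window_s=10.0)
samples = [60, 95] + [60] * 10
out = [tracker.update(t, temp) for t, temp in enumerate(samples)]
```

In this run the output rises to 95 on the same sample as the spike and stays there until the spike is 10 seconds old. That shows both points made in the thread: the reaction to a rise is immediate, and the reaction to a fall is delayed by the window length.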
So I ran the new firmware. I did not expect it to take a 3 GB download, half an hour of build time, and a switch to Linux to build it, but that's beside the point. I definitely noticed that its fan curves are something the majority of users will see as a huge improvement. However, my problem persists: the temps still hit 90-95C in very sudden spikes, and the fans take 2-3 seconds to spin up when that happens. But after thinking about it some more, I came to the conclusion that the problem lies more in the thermal capacity of the CPU heatsink than in the software that controls the fans. I think the temperature should not be able to spike from 60 to 95 in literally one second. I've never seen this problem in any other laptop, because heat dissipation is typically the main problem for most models: the fans are just not strong enough and can't dissipate enough thermal energy. But that's definitely not the case here: when the fans spin, they keep the temp impressively low while producing impressively little noise. It seems to be specifically an issue of low thermal capacity, and specifically of the CPU heatsink. |
FWIW, I observe similar spikes in CPU temperature on a 2015 15" MacBook Pro, so it's far from unique to these Clevo laptops. Probably more of an Intel + lighter laptop thing (most laptops). |
Can this issue be closed? |
So I ran the new firmware a fair bit. The problem still reproduces, but with different behaviour. Basically, the thermals during gaming stay around 80-85C most of the time, which I guess is acceptable for games that don't support framerate limiting. But occasionally the temp jumps very quickly and erratically to 90-95C and then drops back in literally a second. I don't think it's reasonable to react to this by adjusting the fan speed when it happens; it just happens way too fast. Instead, I'd prefer my fans to overcool my CPU when I'm gaming so that (roughly speaking) the temp jumps from 75 to 85 instead of from 85 to 95. But there's currently no way to say that my desired temp is actually 75 and that I don't mind my fans working harder while I'm gaming in anticipation of spikes rather than trying to react to them. |
@Raikiri I wonder if maybe the thermal paste was not applied correctly in the factory. |
@jackpot51 But maybe it's not the reason and I'm wrong. I can attempt to repaste my laptop, but I never tried it before, so will need to do plenty of research to make sure I don't mess anything up. |
@Raikiri If you haven't found it already, the oryp7 tech-docs may be helpful: https://tech-docs.system76.com/models/oryp7/repairs.html#replacing-the-cooling-system |
@Raikiri not much to be done on the heatsink capacity, but you can easily modify your fan curve in firmware to hit 100% fans at 80C for example. Won't prevent temperature spikes of course, but should help if you don't mind the noise. This is the fan curve I run: https://github.com/curiousercreative/ec/blob/galp5/src/board/system76/galp5/board.mk#L44. Breakpoints are to match thermal targets set by system76-power profiles, which probably aren't available in Windows. |
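To make the idea of a custom curve concrete, here is a minimal sketch of a fan curve defined by breakpoints that reaches 100% at 80C. It assumes linear interpolation between breakpoints and clamping outside them. The breakpoint values and the name `fan_duty` are made up for illustration; they are not the ones in the linked `board.mk`, and the firmware's own interpolation may differ.

```python
# Hypothetical breakpoints: (temperature in C, fan duty in percent).
CURVE = [(50, 30), (65, 50), (80, 100)]


def fan_duty(temp_c: float, curve=CURVE) -> float:
    """Return the fan duty for `temp_c` by interpolating between breakpoints."""
    # Below the first breakpoint, hold the lowest duty (an assumption).
    if temp_c <= curve[0][0]:
        return float(curve[0][1])
    # Find the segment containing temp_c and interpolate linearly.
    for (t0, d0), (t1, d1) in zip(curve, curve[1:]):
        if temp_c <= t1:
            return d0 + (d1 - d0) * (temp_c - t0) / (t1 - t0)
    # At or above the last breakpoint, hold the highest duty.
    return float(curve[-1][1])
```

Moving the last breakpoint to a lower temperature makes the fans reach full speed earlier, at the cost of more noise.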
I followed these build instructions, and these are the errors my Oryp8 spits out. I would really like to understand why I cannot build the firmware. I was able to build it just fine last month, on a different install but the same PC. |
still having issues trying to build the firmware
|
@Localacct21 please try using a multi-line code block for that. To do that, use a triple backtick (`) to open the block and again to close it.
When I attempt to follow these instructions, I receive a different error. This may be resolved by running apt upgrade, but I'm holding off on that kernel upgrade as I run ZFS on this system:
|
@curiousercreative If you're intentionally not updating your system (which is not generally recommended), you can try just updating udev with We have recently started packaging ZFS and I know that it works with the kernel that we're currently shipping, at least for a basic partitioning setup; it's one of the things currently preventing us from releasing kernel 5.16. |
@jacobgkau oh hey, thanks for sharing the ZFS update. I wasn't planning to hold back long, thought it'd be a few days. I was keeping an eye on this issue which can probably be updated and closed: pop-os/pop#2032 |
@Raikiri I had exactly the same issue. Two things:
As for paste, I did see improvements by repasting; but that'd be true for most laptops. My intuition is very much that your analysis of the CPU heatsink just not having enough capacity is correct. |
Glad that I found this GitHub issue, as I'm trying to solve the same cooling problem on my Oryp6 (which has almost the same internal structure and cooling system as the Oryp7). I recently re-pasted it with Arctic Silver 5. With single-digit percentage CPU load (web gaming using the 2080), the CPU temp can still go up to I'm not sure if it's more about the heatsink's capacity or about not putting enough paste on it (buttered toast: thin, but enough to cover the die), but I'm planning to try Kryonaut and mod the cooling system (adding a water-cooling pipe onto it) at the same time. I wonder if anyone can provide a measurement of the gap between the pipes and the bottom chassis, so I can start modding it like the water-cooling system on the Eluktronics notebook. |
@Raikiri Try replacing the stock thermal pads with copper shims. Also try thermal-taping additional copper to the top of the heatsink. Then put a thermal pad on top of that, so that it touches the aluminium chassis. That will add a shitload of extra thermal capacity, and even increase active cooling (since the chassis is air-cooled by the fans). Also install Fact of the matter is: there is no universe in which that tiny amount of heatsink (as shown in the picture) is going to be adequate to cool an i7. The only solution is to add more heatsink. Repasting, as you did, only increases the rate of thermal transfer from the CPU; that only helps if there's actually somewhere for that thermal energy to go. You should also look into flashing your own custom fan curve to the EC; the default one is terrible, and doesn't reach 100% until 90°C, long after the CPU has already started thermal-throttling. |
I'm running Windows 10 with Libre Hardware Monitor to track the temps of my CPU and GPU. I noticed that during gaming, CPU thermals jump wildly from 60C to as high as 90C, sometimes multiple times within 10 seconds.
Usually it happens when stuff is happening in a game that suddenly spikes temperature to 85C+ in a matter of seconds while the fans are practically idling. After a second fans start blasting like nobody's business until temp drops to below 60C in 3 seconds or so. Then they practically turn off again thinking that their job is done and the cycle obviously repeats.
I have two issues here:
Sometimes I see temperature jump in Libre Monitor to 85C and the fans are still idling, sometimes requiring more than a couple seconds to "react". Well first, how is it even possible that the temperature changes so quickly? But if it does, I think there should be no smoothing applied to the thermal curves and the fans should be blasting full speed before a meatbag like myself can even notice it in a 3rd party temp monitor program.
When I'm gaming, I want my lowest RPM to be at the minimal level that sustains an acceptable temperature long term, not just momentarily. For example, if the temperature momentarily drops below 60C, it does not mean that the fans should be turned off if during the last half a minute they were spinning at 3k RPM and the temperature was 65C. What I'm saying is, it should consider some sort of moving average rather than reading the momentary temperature and then trying to smooth the result (which I believe it currently does not even attempt).
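The moving average suggested in the second point could look roughly like the sketch below. It is a simple windowed mean over the last few samples. The class name `MovingAverage`, the window size, and the temperatures are assumptions for illustration only.

```python
from collections import deque


class MovingAverage:
    """Mean of the most recent `size` temperature samples."""

    def __init__(self, size: int = 30) -> None:
        self._samples = deque(maxlen=size)
        self._total = 0.0

    def update(self, temp_c: float) -> float:
        # When the window is full, the oldest sample is about to be
        # evicted by append(), so remove it from the running total first.
        if len(self._samples) == self._samples.maxlen:
            self._total -= self._samples[0]
        self._samples.append(temp_c)
        self._total += temp_c
        return self._total / len(self._samples)


# A steady 65 C followed by one momentary dip to 57 C (window of 4 samples).
avg = MovingAverage(size=4)
readings = [avg.update(t) for t in (65, 65, 65, 65, 57)]
```

With a window of four samples, a single dip from 65C to 57C moves the average only to 63C, so a controller driven by the average would not drop the fans on a momentary dip. The same lag applies in the other direction: a sudden spike also takes several samples to show up in the average.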
PS the issue mostly affects CPU temperature, because GPU temperature seems to be way less jumpy in comparison. It usually takes at least 5-10 seconds for the GPU to reach a high temperature and the cooling system has enough time to react, but CPU jumps to very high temperature very quickly.