========================
How to get s2ram working
========================

2006 Linus Torvalds
2006 Pavel Machek

1) Check suspend.sf.net; the s2ram program there has a long whitelist of
   "known ok" machines, along with tricks to use on each one.

2) If that does not help, try reading tricks.txt and
   video.txt. Perhaps the problem is as simple as a broken module, and
   a simple module unload can fix it.

3) You can use Linus' TRACE_RESUME infrastructure, described below.

Using TRACE_RESUME
~~~~~~~~~~~~~~~~~~

I've been working at making the machines I have able to STR, and almost
always it's a driver that is buggy. Thank God for the suspend/resume
debugging - the thing that Chuck tried to disable. That's often the _only_
way to debug these things, and it's actually pretty powerful (but
time-consuming - having to insert TRACE_RESUME() markers into the device
driver that doesn't resume and recompile and reboot).

Anyway, the way to debug this for people who are interested (have a
machine that doesn't resume) is:

 - enable PM_DEBUG, and PM_TRACE

 - use a script like this::

        #!/bin/sh
        sync
        echo 1 > /sys/power/pm_trace
        echo mem > /sys/power/state

   to suspend

 - if it doesn't come back up (which is usually the problem), reboot by
   holding the power button down, and look at the dmesg output for things
   like::

        Magic number: 4:156:725
        hash matches drivers/base/power/resume.c:28
        hash matches device 0000:01:00.0

   which means that the last trace event was just before trying to resume
   device 0000:01:00.0. Then figure out what driver is controlling that
   device (lspci and /sys/devices/pci* are your friends), and see if you can
   fix it, disable it, or trace into its resume function.

   If no device matches the hash (or any matches appear to be false positives),
   the culprit may be a device from a loadable kernel module that is not loaded
   until after the hash is checked. You can check the hash against the current
   devices again after more modules are loaded using sysfs::

        cat /sys/power/pm_trace_dev_match

For example, the above happens to be the VGA device on my EVO, which I
used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out
that "radeonfb" simply cannot resume that device - it tries to set the
PLL's, and it just _hangs_. Using the regular VGA console and letting X
resume it instead works fine.
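
The "figure out what driver is controlling that device" step can be
sketched in shell. This is only an illustration, assuming the reported
device is a PCI device: the helper names ``trace_devices`` and
``pci_driver_for`` and the ``SYSFS_ROOT`` override are made up for this
example, and the lookup relies on the ``driver`` symlink that sysfs exposes
for a bound PCI device.

```shell
#!/bin/sh
# Hypothetical helpers, not part of any kernel tool.

# Read dmesg output on stdin and print the device address from every
# "hash matches device <addr>" line.
trace_devices() {
    sed -n 's/^.*hash matches device \(.*\)$/\1/p'
}

# Print the name of the driver bound to the PCI device at address $1.
# SYSFS_ROOT is an assumption added for testing; it defaults to /sys.
pci_driver_for() {
    addr=$1
    root=${SYSFS_ROOT:-/sys}
    link="$root/bus/pci/devices/$addr/driver"
    if [ -L "$link" ]; then
        # The symlink target ends in the driver name.
        basename "$(readlink "$link")"
    else
        echo "no driver bound to $addr" >&2
        return 1
    fi
}
```

After rebooting from a failed resume, something like
``dmesg | trace_devices | while read -r dev; do pci_driver_for "$dev"; done``
would list the drivers worth inspecting first.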

NOTE
====
pm_trace uses the system's Real Time Clock (RTC) to save the magic number.
The reason for this is that the RTC is the only reliably available piece of
hardware during resume operations where a value can be set that will
survive a reboot.

pm_trace is not compatible with asynchronous suspend, so it turns
asynchronous suspend off (which may work around timing or
ordering-sensitive bugs).

The consequence is that after a resume (even if it is successful) your system
clock will have a value corresponding to the magic number instead of the
correct date/time! It is therefore advisable to use a program like ntpdate
or rdate to reset the correct date/time from an external time source when
using this trace option.

As the clock keeps ticking it is also essential that the reboot is done
quickly after the resume failure. The trace option does not use the seconds
or the low order bits of the minutes of the RTC, but a too long delay will
corrupt the magic value.
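
The clock reset advised above can be folded into the suspend script. The
following is a minimal sketch, not a tested recipe: it assumes ``ntpdate``
and ``hwclock`` are installed, that the machine has network access after
resume, and that ``pool.ntp.org`` is an acceptable time source. The
function names and the ``PM_DIR`` and ``NTP_SERVER`` variables are
introduced here for illustration only.

```shell
#!/bin/sh
# Hypothetical wrapper around the suspend script shown earlier.

# Assumed time source; substitute your own NTP server.
NTP_SERVER=${NTP_SERVER:-pool.ntp.org}
# Location of the power-management sysfs files (overridable for testing).
PM_DIR=${PM_DIR:-/sys/power}

# Flush disks, enable pm_trace, then suspend to RAM.
suspend_with_trace() {
    sync
    echo 1 > "$PM_DIR/pm_trace"
    echo mem > "$PM_DIR/state"
}

# After a successful resume the clock holds the magic number, so set the
# system time from the network and write it back to the RTC.
fix_clock() {
    ntpdate "$NTP_SERVER" && hwclock --systohc
}
```

Run as root, ``suspend_with_trace && fix_clock`` suspends the machine and
corrects the clock once it comes back. If the resume hangs, the second
step never runs and the magic number stays in the RTC for the next boot to
report.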