========================
How to get s2ram working
========================

2006 Linus Torvalds
2006 Pavel Machek

1) Check suspend.sf.net. The s2ram program there has a long whitelist of
   "known ok" machines, along with the tricks to use on each one.

2) If that does not help, try reading tricks.txt and
   video.txt. Perhaps the problem is as simple as a broken module, and
   a simple module unload can fix it.

3) You can use Linus' TRACE_RESUME infrastructure, described below.

Using TRACE_RESUME
~~~~~~~~~~~~~~~~~~

I've been working at making the machines I have able to STR, and almost
always it's a driver that is buggy. Thank God for the suspend/resume
debugging - the thing that Chuck tried to disable. That's often the _only_
way to debug these things, and it's actually pretty powerful (but
time-consuming - having to insert TRACE_RESUME() markers into the device
driver that doesn't resume and recompile and reboot).

Anyway, the way to debug this for people who are interested (have a
machine that doesn't resume) is:

 - enable PM_DEBUG and PM_TRACE in the kernel configuration

 - use a script like this::

	#!/bin/sh
	# flush dirty data to disk; a failed resume ends with a hard power-off
	sync
	# record trace points in the RTC during the next suspend/resume cycle
	echo 1 > /sys/power/pm_trace
	echo mem > /sys/power/state

   to suspend

 - if it doesn't come back up (which is usually the problem), reboot by
   holding the power button down, and look at the dmesg output for things
   like::

	Magic number: 4:156:725
	hash matches drivers/base/power/resume.c:28
	hash matches device 0000:01:00.0

   which means that the last trace event was just before trying to resume
   device 0000:01:00.0. Then figure out what driver is controlling that
   device (lspci and /sys/devices/pci* are your friends), and see if you can
   fix it, disable it, or trace into its resume function.

   If no device matches the hash (or any matches appear to be false positives),
   the culprit may be a device from a loadable kernel module that is not loaded
   until after the hash is checked. You can check the hash against the current
   devices again after more modules are loaded using sysfs::

	cat /sys/power/pm_trace_dev_match

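The steps in the list above could be wrapped in two small shell functions.
This is only a sketch: the function names and the optional directory argument
are made up for illustration (the argument exists so the code can be tried
against a scratch directory), while the sysfs files and the dmesg strings are
the ones shown above.

```shell
#!/bin/sh
# Hypothetical helpers, not part of s2ram or any other existing tool.

# Suspend to RAM with pm_trace enabled.
# $1 optionally overrides the sysfs power directory (default /sys/power).
suspend_with_trace() {
	power_dir=${1:-/sys/power}

	# The pm_trace file only exists when the kernel was built with PM_TRACE.
	if [ ! -w "$power_dir/pm_trace" ]; then
		echo "no writable $power_dir/pm_trace" >&2
		return 1
	fi

	sync
	echo 1 > "$power_dir/pm_trace"
	echo mem > "$power_dir/state"
}

# After the forced reboot: keep only the pm_trace lines of a kernel log
# that is read from standard input.
trace_matches() {
	grep -e 'Magic number:' -e 'hash matches'
}
```

Run ``suspend_with_trace`` as root to suspend; after a failed resume and the
forced reboot, ``dmesg | trace_matches`` prints only the lines of interest.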
For example, device 0000:01:00.0 above happens to be the VGA device on my EVO,
which I used to run with "radeonfb" (it's an ATI Radeon mobility). It turns out
that "radeonfb" simply cannot resume that device - it tries to set the
PLLs, and it just _hangs_. Using the regular VGA console and letting X
resume it instead works fine.

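To find out which driver is bound to the device named in the trace, the
"driver" symlink that sysfs keeps for every bound device can be read. A minimal
sketch, assuming the device is a PCI device; the function name and the optional
second argument (an alternative sysfs directory, useful for trying the function
out) are hypothetical.

```shell
#!/bin/sh
# Print the name of the driver bound to a PCI device.
# $1: PCI address as printed by pm_trace, e.g. 0000:01:00.0
# $2: optional override of the sysfs PCI device directory (illustration only)
pci_driver_of() {
	dev=$1
	pci_dir=${2:-/sys/bus/pci/devices}
	link="$pci_dir/$dev/driver"

	# A device without a bound driver has no "driver" symlink.
	if [ ! -L "$link" ]; then
		echo "no driver bound to $dev" >&2
		return 1
	fi

	# The symlink points at the driver's directory; its last path
	# component is the driver name.
	basename "$(readlink "$link")"
}
```

For the example above, ``pci_driver_of 0000:01:00.0`` names the driver whose
resume function is the first suspect.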
NOTE
====
pm_trace uses the system's Real Time Clock (RTC) to save the magic number.
The reason for this is that the RTC is the only reliably available piece of
hardware during resume operations where a value can be set that will
survive a reboot.

pm_trace is not compatible with asynchronous suspend, so it turns
asynchronous suspend off (which may work around timing or
ordering-sensitive bugs).

A consequence of storing the magic number in the RTC is that after a resume
(even if it is successful) your system clock will have a value corresponding
to the magic number instead of the correct date/time! It is therefore
advisable to use a program like ntpdate or rdate to reset the correct
date/time from an external time source when using this trace option.

As the clock keeps ticking it is also essential that the reboot is done
quickly after the resume failure. The trace option does not use the seconds
or the low order bits of the minutes of the RTC, but too long a delay will
corrupt the magic value.

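The arithmetic below illustrates why a short delay is harmless and a long one
is not. It is a sketch under assumptions that this document does not state: the
constants (16 and 997), the field order, and the three-minute granularity are
modelled on drivers/base/power/trace.c and may differ in your kernel, so check
the source before relying on them. The function names are hypothetical.

```shell
#!/bin/sh
# Illustration only; constants and layout are assumptions (see text).
USERHASH=16
FILEHASH=997

# Pack a user:file:device triple into RTC fields.
# Prints: two-digit year, month (counting from zero), day, hour, minute.
magic_to_rtc() {
	n=$(( $1 + USERHASH * ($2 + FILEHASH * $3) ))
	year=$(( n % 100 ));    n=$(( n / 100 ))
	mon=$(( n % 12 ));      n=$(( n / 12 ))
	mday=$(( n % 28 + 1 )); n=$(( n / 28 ))
	hour=$(( n % 24 ));     n=$(( n / 24 ))
	min=$(( (n % 20) * 3 ))   # only multiples of three minutes are written
	echo "$year $mon $mday $hour $min"
}

# Recover the triple from RTC fields given in the same order.
# Minutes are divided by 3, so the low-order part of the minutes is ignored.
rtc_to_magic() {
	val=$(( $1 + 100 * ($2 + 12 * (($3 - 1) + 28 * ($4 + 24 * ($5 / 3)))) ))
	user=$(( val % USERHASH )); val=$(( val / USERHASH ))
	file=$(( val % FILEHASH )); dev=$(( val / FILEHASH ))
	echo "$user:$file:$dev"
}
```

``magic_to_rtc 4 156 725`` prints the fields that would be written for the
sample magic number shown earlier, and feeding those fields to ``rtc_to_magic``
gives back ``4:156:725``. Under these assumptions the decoded value stays the
same while the minute advances by one or two, and changes once it crosses the
next multiple of three.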