• Antergos HANGS itself every one/two days


    @Quantum_Sniper said in Antergos HANGS itself every one/two days:

    dmesg and cat results in the terminal are too huge and am unable to put it in a single post.

    https://hastebin.com/

  • @joekamprad Thanks for the timely help :)

    dmesg
    https://ptpb.pw/ub3S

  • Watching you pros go about the situation makes me feel like a total NOOB now. I don’t really know anything about Linux.

  • cat /var/log/Xorg.0.log looks good

    dmesg →

    ACPI FADT declares the system doesn’t support PCIe ASPM, so disable it
    https://lwn.net/Articles/449448/
    https://sourceforge.net/p/e1000/bugs/100/#a5be

    [ 2872.774559] nouveau 0000:08:00.0: fifo: PIO_ERROR
    https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1474538

    blacklist-nouveau

    EDIT: Apart from these errors, I did not find much in the logs.
    I cannot pin down what is actually causing the system to freeze.
    There is enough information here; if someone else wants to go searching for the problem, all the better.
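
For anyone who wants to try the blacklist-nouveau suggestion above, here is a minimal sketch of what it usually amounts to: a modprobe.d snippet that stops the nouveau module from being loaded automatically. The helper name `write_nouveau_blacklist` and its directory argument are assumptions made for illustration; on a real system the directory would normally be /etc/modprobe.d and the write needs root.

```shell
# write_nouveau_blacklist DIR
# Hypothetical helper (not a distro tool): writes DIR/blacklist-nouveau.conf
# containing a single "blacklist nouveau" line.
write_nouveau_blacklist() {
    dir="$1"
    # Refuse to write anywhere that is not an existing directory.
    [ -d "$dir" ] || return 1
    printf '%s\n' 'blacklist nouveau' > "$dir/blacklist-nouveau.conf"
}

# Example (as root): write_nouveau_blacklist /etc/modprobe.d
```

After a reboot, `lsmod | grep nouveau` should print nothing if the blacklist took effect. If the module is also bundled into the initramfs, that image may need regenerating as well.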

  • @judd How do I disable it then? Any commands?

  • @judd Judd, I am going to sleep now. It’s 3:36 AM here (haven’t slept). I will check back here tomorrow morning. Thanks for your help bro. The log files really need experienced ppl to check it. It’s too huge for me to even like know where to begin. Thanks and see u in the morning.

  • @Quantum_Sniper said in Antergos HANGS itself every one/two days:

    @judd How do I disable it then? Any commands?

    This is the first time I have seen this; we have to investigate more …

  • @Quantum_Sniper said in Antergos HANGS itself every one/two days:

    @judd Judd, I am going to sleep now. It’s 3:36 AM here (haven’t slept). I will check back here tomorrow morning. Thanks for your help bro. The log files really need experienced ppl to check it. It’s too huge for me to even like know where to begin. Thanks and see u in the morning.

    Have a good rest.

  • @Quantum_Sniper

    [ACPI FADT declares the system doesn’t support PCIe ASPM, so disable it]
    https://en.wikipedia.org/wiki/Active_State_Power_Management

    It’s power management, and this would normally be a red herring: the message simply states that something isn’t supported. But given your problem, I would try disabling it (the setting will be in the BIOS or in the PM software). I would be surprised if it’s listed individually, so try disabling PM completely; you can then prove or disprove PM as the cause.
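
As a rough sketch of how to answer the "any commands?" question for ASPM: the kernel exposes its ASPM policy in /sys/module/pcie_aspm/parameters/policy and marks the active entry with square brackets. The helper below, `active_aspm_policy`, is a made-up name for illustration; it only extracts the bracketed entry from such a line. Note that the dmesg message quoted above suggests the kernel has already turned ASPM handling off on this machine, so this is mainly a way to confirm that.

```shell
# active_aspm_policy LINE
# Prints the bracketed (active) entry of a pcie_aspm policy line,
# or nothing if the line contains no bracketed entry.
active_aspm_policy() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Example: active_aspm_policy "$(cat /sys/module/pcie_aspm/parameters/policy)"
```

To rule ASPM out entirely for a test period, the kernel parameter `pcie_aspm=off` can be added to the boot command line; removing it again restores the previous behaviour.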

  • @judd Sorry for my late response, I was without an Internet connection for the past 2 days. My ISP is currently having some maintenance problems, so my internet access is going to be unreliable for the time being.

    @judd thanks for providing me the links. It must have taken a LOT of work to search the forums. I went through the links you provided, and this link felt the most relevant and relatable. I am facing the same problem as the people who posted in that thread, the only difference being that my situation is (thankfully) not as bad as theirs. Some people in that thread have systems that crash or freeze every few hours. Some say their system crashes every time they put it under heavy load (multiple browser tabs, virtual machines, editors, games, etc.), and some say it crashes every time they open YouTube or anything video related. Everybody in the thread was like "I have the same problem", "me too"… but none of them had an exact idea of why the problem was occurring. Users of my laptop model, the Dell Inspiron 15 with an i5, were also present and had the same problem, but as I said before, no concrete evidence was found.

    The users in that thread also went through a real circus and tried a lot of things. They were discussing things I have no idea about. Finally they resorted to changing the kernel parameters

    • intel_idle.max_cstate=1 kernel parameter

    The above change helped some people while it did not for others.

    Still don’t know how to go forward.

    Man, after going through Arch forums, I feel like the Arch users are scientists/researchers.
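
For the `intel_idle.max_cstate=1` parameter mentioned above, here is a minimal sketch of how a kernel parameter is typically added, assuming the system boots through GRUB (the `BOOT_IMAGE=` entry in /proc/cmdline hints at that, but it is an assumption). The helper `add_kernel_param` is a hypothetical name; it appends a parameter to the `GRUB_CMDLINE_LINUX_DEFAULT="..."` line of a file and does nothing if the parameter is already present. It assumes the parameter contains no `|`, `&`, or backslash characters.

```shell
# add_kernel_param FILE PARAM
# Hypothetical helper: appends PARAM inside the quotes of the
# GRUB_CMDLINE_LINUX_DEFAULT="..." line in FILE, once.
add_kernel_param() {
    file="$1"
    param="$2"
    # Already present on that line? Then leave the file untouched.
    if grep '^GRUB_CMDLINE_LINUX_DEFAULT=' "$file" | grep -qF -- "$param"; then
        return 0
    fi
    tmp="$(mktemp)" || return 1
    # Insert " PARAM" just before the closing quote of the value.
    sed "s|^\(GRUB_CMDLINE_LINUX_DEFAULT=\"[^\"]*\)\"|\1 ${param}\"|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rc=$?
    rm -f "$tmp"
    return "$rc"
}

# Example: work on a copy first, inspect it, then copy it back as root.
#   cp /etc/default/grub /tmp/grub.new
#   add_kernel_param /tmp/grub.new 'intel_idle.max_cstate=1'
```

After copying the edited file back to /etc/default/grub, the boot configuration has to be regenerated (`sudo grub-mkconfig -o /boot/grub/grub.cfg` on Arch-based systems) and the machine rebooted. `cat /proc/cmdline` then shows whether the parameter is active, and removing it again is the same procedure in reverse.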

  • @judd , @robgriff444 Isn’t Active State Power Management (ASPM) good? I read that it decreases power usage by bringing the system to a low-power state, at the cost of increased latency. I also read that disabling ASPM results in increased usage of memory and decreased performance. This sucks. I need long battery life for my laptop. Currently my laptop battery lasts for around 6 - 6.5 hours. No other distro gives me this much battery life, and that is partly why I use Antergos. Debian comes closest to Antergos in this regard and gives about 6 hrs of battery life. I have TLP enabled.

    $ sudo tlp stat
    [sudo] password for darshan: 
    --- TLP 1.1 --------------------------------------------
    
    +++ Configured Settings: /etc/default/tlp
    TLP_ENABLE=1
    TLP_DEFAULT_MODE=AC
    TLP_PERSISTENT_DEFAULT=0
    DISK_IDLE_SECS_ON_AC=0
    DISK_IDLE_SECS_ON_BAT=2
    MAX_LOST_WORK_SECS_ON_AC=15
    MAX_LOST_WORK_SECS_ON_BAT=60
    CPU_HWP_ON_AC=balance_performance
    CPU_HWP_ON_BAT=balance_power
    SCHED_POWERSAVE_ON_AC=0
    SCHED_POWERSAVE_ON_BAT=1
    NMI_WATCHDOG=0
    ENERGY_PERF_POLICY_ON_AC=performance
    ENERGY_PERF_POLICY_ON_BAT=power
    DISK_DEVICES="sda sdb"
    DISK_APM_LEVEL_ON_AC="254 254"
    DISK_APM_LEVEL_ON_BAT="128 128"
    SATA_LINKPWR_ON_AC="med_power_with_dipm max_performance"
    SATA_LINKPWR_ON_BAT="med_power_with_dipm min_power"
    AHCI_RUNTIME_PM_TIMEOUT=15
    PCIE_ASPM_ON_AC=performance
    PCIE_ASPM_ON_BAT=powersave
    RADEON_POWER_PROFILE_ON_AC=high
    RADEON_POWER_PROFILE_ON_BAT=low
    RADEON_DPM_STATE_ON_AC=performance
    RADEON_DPM_STATE_ON_BAT=battery
    RADEON_DPM_PERF_LEVEL_ON_AC=auto
    RADEON_DPM_PERF_LEVEL_ON_BAT=auto
    WIFI_PWR_ON_AC=off
    WIFI_PWR_ON_BAT=on
    WOL_DISABLE=Y
    SOUND_POWER_SAVE_ON_AC=0
    SOUND_POWER_SAVE_ON_BAT=1
    SOUND_POWER_SAVE_CONTROLLER=Y
    BAY_POWEROFF_ON_AC=0
    BAY_POWEROFF_ON_BAT=0
    BAY_DEVICE="sr0"
    RUNTIME_PM_ON_AC=on
    RUNTIME_PM_ON_BAT=auto
    USB_AUTOSUSPEND=1
    USB_BLACKLIST_BTUSB=0
    USB_BLACKLIST_PHONE=0
    USB_BLACKLIST_PRINTER=1
    USB_BLACKLIST_WWAN=1
    RESTORE_DEVICE_STATE_ON_STARTUP=0
    
    +++ System Info
    System         = Dell Inc. A10 Inspiron 3543
    BIOS           = A10
    Kernel         = 4.20.0-arch1-1-ARCH #1 SMP PREEMPT Mon Dec 24 03:00:40 UTC 2018 x86_64
    /proc/cmdline  = BOOT_IMAGE=/boot/vmlinuz-linux root=UUID=d37ca417-83a0-4c74-9a5e-13d49b7296a2 rw quiet resume=UUID=128be186-1e22-4308-abad-c00bc1680a87
    Init system    = systemd 
    Boot mode      = UEFI
    
    +++ TLP Status
    State          = enabled
    Last run       = 10:29:47 PM,   4685 sec(s) ago
    Mode           = battery
    Power source   = battery
    
    +++ Processor
    CPU model      = Intel(R) Core(TM) i5-5200U CPU @ 2.20GHz
    
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_driver    = intel_pstate
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor  = powersave
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_available_governors = performance powersave
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_min_freq  =   500000 [kHz]
    /sys/devices/system/cpu/cpu0/cpufreq/scaling_max_freq  =  2700000 [kHz]
    
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_driver    = intel_pstate
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_governor  = powersave
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_available_governors = performance powersave
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_min_freq  =   500000 [kHz]
    /sys/devices/system/cpu/cpu1/cpufreq/scaling_max_freq  =  2700000 [kHz]
    
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_driver    = intel_pstate
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_governor  = powersave
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_available_governors = performance powersave
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_min_freq  =   500000 [kHz]
    /sys/devices/system/cpu/cpu2/cpufreq/scaling_max_freq  =  2700000 [kHz]
    
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_driver    = intel_pstate
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_governor  = powersave
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_available_governors = performance powersave
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_min_freq  =   500000 [kHz]
    /sys/devices/system/cpu/cpu3/cpufreq/scaling_max_freq  =  2700000 [kHz]
    
    /sys/devices/system/cpu/intel_pstate/min_perf_pct      =  18 [%]
    /sys/devices/system/cpu/intel_pstate/max_perf_pct      = 100 [%]
    /sys/devices/system/cpu/intel_pstate/no_turbo          =   0
    /sys/devices/system/cpu/intel_pstate/turbo_pct         =  27 [%]
    /sys/devices/system/cpu/intel_pstate/num_pstates       =  23
    
    x86_energy_perf_policy.cpu0                            = power 
    x86_energy_perf_policy.cpu1                            = power 
    x86_energy_perf_policy.cpu2                            = power 
    x86_energy_perf_policy.cpu3                            = power 
    
    /sys/module/workqueue/parameters/power_efficient       = Y
    /proc/sys/kernel/nmi_watchdog                          = 0
    
    +++ Undervolting
    PHC kernel not available.
    
    +++ Temperatures
    CPU temp               =    39 [°C]
    Fan speed (fan1)       =     0 [/min]
    
    +++ File System
    /proc/sys/vm/laptop_mode               =     2
    /proc/sys/vm/dirty_writeback_centisecs =  6000
    /proc/sys/vm/dirty_expire_centisecs    =  6000
    /proc/sys/vm/dirty_ratio               =    20
    /proc/sys/vm/dirty_background_ratio    =    10
    
    +++ Storage Devices
    /dev/sda:
      Model     = ST1000LM024 HN-M101MBB                  
      Firmware  = 2BA30004
      APM Level = 128
      Status    = active/idle
      Scheduler = mq-deadline
    
      Runtime PM: control = on, autosuspend_delay =   -1
    
    
    +++ AHCI Link Power Management (ALPM)
    /sys/class/scsi_host/host0/link_power_management_policy  = med_power_with_dipm
    /sys/class/scsi_host/host1/link_power_management_policy  = med_power_with_dipm
    /sys/class/scsi_host/host2/link_power_management_policy  = med_power_with_dipm
    /sys/class/scsi_host/host3/link_power_management_policy  = med_power_with_dipm
    
    +++ AHCI Host Controller Runtime Power Management
    /sys/bus/pci/devices/0000:00:1f.2/ata1/power/control = on
    /sys/bus/pci/devices/0000:00:1f.2/ata2/power/control = on
    /sys/bus/pci/devices/0000:00:1f.2/ata3/power/control = on
    /sys/bus/pci/devices/0000:00:1f.2/ata4/power/control = on
    
    +++ PCIe Active State Power Management
    /sys/module/pcie_aspm/parameters/policy = default (using bios preferences)
    
    +++ Intel Graphics
    /sys/module/i915/parameters/enable_dc        = -1 (use per-chip default)
    /sys/module/i915/parameters/enable_fbc       =  1 (enabled)
    /sys/module/i915/parameters/enable_psr       = -1 (use per-chip default)
    /sys/module/i915/parameters/modeset          =  1 (enabled)
    
    +++ Wireless
    bluetooth = on
    wifi      = on
    wwan      = none (no device)
    
    hci0(btusb)                   : bluetooth, not connected
    wlp6s0(wl)                    : wifi, connected, power management = on
    
    +++ Audio
    /sys/module/snd_hda_intel/parameters/power_save            = 1
    /sys/module/snd_hda_intel/parameters/power_save_controller = Y
    
    +++ Runtime Power Management
    Device blacklist = (not configured)
    Driver blacklist = amdgpu nouveau nvidia radeon (default)
    
    /sys/bus/pci/devices/0000:00:00.0/power/control = auto (0x060000, Host bridge, bdw_uncore)
    /sys/bus/pci/devices/0000:00:02.0/power/control = auto (0x030000, VGA compatible controller, i915)
    /sys/bus/pci/devices/0000:00:03.0/power/control = auto (0x040300, Audio device, snd_hda_intel)
    /sys/bus/pci/devices/0000:00:14.0/power/control = auto (0x0c0330, USB controller, xhci_hcd)
    /sys/bus/pci/devices/0000:00:16.0/power/control = auto (0x078000, Communication controller, mei_me)
    /sys/bus/pci/devices/0000:00:1b.0/power/control = auto (0x040300, Audio device, snd_hda_intel)
    /sys/bus/pci/devices/0000:00:1c.0/power/control = auto (0x060400, PCI bridge, pcieport)
    /sys/bus/pci/devices/0000:00:1c.2/power/control = auto (0x060400, PCI bridge, pcieport)
    /sys/bus/pci/devices/0000:00:1c.3/power/control = auto (0x060400, PCI bridge, pcieport)
    /sys/bus/pci/devices/0000:00:1c.4/power/control = auto (0x060400, PCI bridge, pcieport)
    /sys/bus/pci/devices/0000:00:1d.0/power/control = auto (0x0c0320, USB controller, ehci-pci)
    /sys/bus/pci/devices/0000:00:1f.0/power/control = auto (0x060100, ISA bridge, lpc_ich)
    /sys/bus/pci/devices/0000:00:1f.2/power/control = auto (0x010601, SATA controller, ahci)
    /sys/bus/pci/devices/0000:00:1f.3/power/control = auto (0x0c0500, SMBus, i801_smbus)
    /sys/bus/pci/devices/0000:06:00.0/power/control = auto (0x028000, Network controller, wl)
    /sys/bus/pci/devices/0000:07:00.0/power/control = auto (0x020000, Ethernet controller, r8169)
    /sys/bus/pci/devices/0000:08:00.0/power/control = auto (0x030200, 3D controller, nouveau)
    
    +++ USB
    Autosuspend         = enabled
    Device whitelist    = (not configured)
    Device blacklist    = (not configured)
    Bluetooth blacklist = disabled
    Phone blacklist     = disabled
    WWAN blacklist      = enabled
    
    Bus 001 Device 006 ID 0bda:0129 control = auto, autosuspend_delay_ms =  2000 -- Realtek Semiconductor Corp. RTS5129 Card Reader Controller (rtsx_usb)
    Bus 001 Device 005 ID 04f3:031a control = on,   autosuspend_delay_ms =  2000 -- Elan Microelectronics Corp.  (usbhid)
    Bus 001 Device 004 ID 0a5c:21d7 control = auto, autosuspend_delay_ms =  2000 -- Broadcom Corp. BCM43142 Bluetooth 4.0 (btusb)
    Bus 001 Device 003 ID 0c45:670b control = auto, autosuspend_delay_ms =  2000 -- Microdia  (uvcvideo)
    Bus 001 Device 002 ID 8087:8001 control = auto, autosuspend_delay_ms =     0 -- Intel Corp.  (hub)
    Bus 001 Device 001 ID 1d6b:0002 control = auto, autosuspend_delay_ms =     0 -- Linux Foundation 2.0 root hub (hub)
    Bus 003 Device 001 ID 1d6b:0003 control = auto, autosuspend_delay_ms =     0 -- Linux Foundation 3.0 root hub (hub)
    Bus 002 Device 002 ID 046d:c535 control = on,   autosuspend_delay_ms =  2000 -- Logitech, Inc.  (usbhid)
    Bus 002 Device 001 ID 1d6b:0002 control = auto, autosuspend_delay_ms =     0 -- Linux Foundation 2.0 root hub (hub)
    
    +++ Battery Status
    /sys/class/power_supply/BAT0/manufacturer                   = SANYO
    /sys/class/power_supply/BAT0/model_name                     = DELL 4WY7C59
    /sys/class/power_supply/BAT0/cycle_count                    = (not supported)
    /sys/class/power_supply/BAT0/charge_full_design             =   2800 [mAh]
    /sys/class/power_supply/BAT0/charge_full                    =   2071 [mAh]
    /sys/class/power_supply/BAT0/charge_now                     =   1653 [mAh]
    /sys/class/power_supply/BAT0/current_now                    =    584 [mA]
    /sys/class/power_supply/BAT0/status                         = Discharging
    
    Charge                                                      =   79.8 [%]
    Capacity                                                    =   74.0 [%]
    
    +++ Suggestions
    * Install smartmontools for disk drive health info
    

    In the thread linked in my previous post, people also tried disabling TLP and the like, but the freezes still didn’t stop. I don’t think TLP is responsible for the freezes. I have used TLP on many distros and never had a problem or freeze.

  • @judd The https://lwn.net/Articles/449448/ article you linked also gave me a brief idea of what ASPM does, along with some basic but important background.

    I am quoting from the above article:

    “In other words, sometimes the BIOS will tell the system that ASPM is not supported even though ASPM support is present; for added fun, the BIOS may enable ASPM on some devices (even though it says ASPM is not supported) before passing control to the kernel. There are reasons why operating system developers tend to hold BIOS developers in low esteem.”

    So there is no single reliable way to tell whether ASPM is present in the system or not? The article was published in 2011; I don’t know about the current situation.
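
One way to see what a particular machine reports, as a rough sketch: `lspci -vv` (run as root) prints, for each PCIe device, an `LnkCap` line listing the ASPM states the link claims to support and an `LnkCtl` line showing what is currently enabled. The helper below, `aspm_link_state`, is a made-up name for illustration; it reads that output on stdin and prints one line per device with the ASPM part of `LnkCtl`. It shows what the hardware registers say right now, not what the BIOS declared in the FADT.

```shell
# aspm_link_state
# Reads `lspci -vv` output on stdin and prints "<device>: <ASPM state>"
# for every device that has a LnkCtl line.
aspm_link_state() {
    awk '
        # Device header lines start at column 0 with the bus address.
        /^[0-9a-f]/ { dev = $1 }
        /LnkCtl:/ {
            sub(/^.*LnkCtl:[ \t]*/, "")   # drop everything up to the field
            sub(/;.*$/, "")               # keep only the part before ";"
            print dev ": " $0
        }
    '
}

# Example: sudo lspci -vv | aspm_link_state
```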

  • Today I updated my systemd-240 and so far no hang/freeze whatsoever. It’s been smooth. This problem is very difficult to solve, almost like a research project. I really appreciate your help.

  • @Quantum_Sniper My reason for suggesting disabling PM was only to help diagnose: 1) it could prove the error in dmesg didn’t matter, and 2) it could prove that it wasn’t something in the PM that was forcing some hardware component to fail. Trust me, this happens, and because your failure is intermittent (and PM is intermittent) it’s worth a try even if it only rules it out. I was a technical manager at a PC builder for years, so troubleshooting HW was often about isolation: ticking things off a list by swapping out the simplest / cheapest / likeliest things, in an order that obviously depended on the fault. HW diagnosis on a PC isn’t very technical, and I would usually just narrow things down as far as possible with the least messing about. So in your case: 1) not swap, 2) not PM, so CPU? HD errors? GPU (overheating)? And in my experience, logs are rarely of any use if it’s hardware, because the thing that records the log is also the thing that freezes.

  • @Quantum_Sniper said in Antergos HANGS itself every one/two days:

    Today I updated my systemd-240 and so far no hang/freeze whatsoever. It’s been smooth. This problem is very difficult to solve, almost like a research project. I really appreciate your help.

    The work done by the people on the Arch forum is the closest thing to a solution. It is quite possible that from kernel 4.14 - 4.15 onwards, up to the current 4.20 (which has even more power management complications), there are power management problems, at least for several i5 users.

    Obviously I do not know why this is, since it is very much an upstream issue.

    I think intel_idle.max_cstate=1 points in the right direction, or at least several people have solved it with that parameter.

    Keep an eye on your machine; future updates may bring an improvement.
    If not, keep researching and, with a bit of luck, you will find the solution.

    Just my two cents.

  • Some Update:

    Antergos hung once more yesterday and I have had enough of it. I can’t afford my laptop crashing in the middle of work, and it’s pretty risky to use it during important moments. So I was forced to move on, and I am on Debian Stable 9.6 MATE now. I still love Antergos, but it’s just misbehaving with my hardware.

  • Been using Antergos for about 4 months now. Also, I noticed that I began spending more time with my system instead of studying. So yeah, in that way it was bad. And the updates were constant, not that it is bad but it is simply annoying.

    On Debian, I just concentrate on my studies and nothing more. No updates. I can update once in 3-4 months and the system is reliable but the software packages are old. I am content with using old software now, but as long as the software works, I am not bothered about the newness/oldness of software.

    Long story short, increased productivity and more time spent studying and academics. I still love Antergos, but I can’t afford to spend time debugging WiFi problems and stuff like that. Can’t afford to spend my life with all these constant updates.

    I don’t think I will use any rolling-release Linux for now. Thanks to the Antergos community for helping me with all the problems I had with my system. There’s no way I could have solved the problems without the help of the community.

    Antergos’s strongest points are its easy installation and its strong and friendly community. Thanks @joekamprad , @judd , @robgriff444 , @manuel , @Bryanpwo , @inffy . If I have forgotten any other members, forgive me. I thank the community members once again.

    I would conclude with “Prevention is better than cure”. Arch offers cure for the problem but Debian prevents the problem from occurring in the first place. I love both Arch and Debian btw. Two different beings.

  • @Quantum_Sniper Fair enough, I’d be of the same mind with your problems, good luck.
