[Solved] Moksha Segfault


Best Answer Jeff, 27 January 2017 - 06:39 PM

Out of curiosity - does disabling the window geometry text in Window Display settings make the crashes stop for you?

oNKMVxs.png

I haven't been able to produce a lock on my system since disabling these. I'm going to look into fixing it, but for now this seems like a good workaround.

69 replies to this topic

#41 Jeff

Jeff

    Lead Developer

  • Developer
  • 12328 posts
  • Location: Bloomington, IL

Posted 23 January 2017 - 09:05 PM

OK - apparently we need to run the crash through valgrind as described on this wiki page: https://phab.enlight...rg/w/debugging/

 

I need you to follow the directions from here - but with the two valgrind options I listed instead of the ones on the wiki page. Sorry for the confusion.





A big thank you to everyone who contributes to Bodhi Linux


#42 Astroboy

Astroboy

    Member

  • Members
  • 323 posts
  • Location: Zacatecas, Mexico

Posted 24 January 2017 - 11:58 PM

I'm lost.

 

Trying to follow the steps on the wiki page gets me nowhere: either valgrind prints a dozen lines and captures nothing, or it does not work at all. Their recommended way of starting valgrind from a console does not work either. I don't know whether the wiki is missing steps or whether, since it is fairly old, its procedures are no longer current.

 

In the previous attempt I did:

In Terminology #1:

Xephyr :2

In Terminology #2:

DISPLAY=:2
enlightenment_start -valgrind=all --show-reachable=no --vgdb-error=0 > output.txt 2>&1



#43 Charles@Bodhi

Charles@Bodhi

    Old Faithful

  • Moderators
  • 4313 posts
  • Location: Zeist, The Netherlands

Posted 25 January 2017 - 10:04 AM

What happens when you change your last command a little, leaving out the =all?

enlightenment_start -valgrind --show-reachable=no --vgdb-error=0 > output.txt 2>&1

Enjoy,

Charles


Medion S4216 Ultrabook, 4GB RAM, 1TB HDD, WIN 10 & Bodhi 2.4.0-64 & Bodhi 3.0.0-64 

Asus eeepc 901, 1GB RAM, 12 GB SSD, Bodhi 2.4.0-32 non-pae & Bodhi 3.0.0-32-Legacy


#44 Jeff

Posted 25 January 2017 - 04:59 PM

Try this:

Terminal one:

Xephyr :2

Terminal Two:
 

export DISPLAY=:2
valgrind --show-reachable=no --vgdb-error=0 --tool=memcheck enlightenment

On an unrelated note - anyone else hate C as much as I do?



#45 Astroboy

Posted 25 January 2017 - 10:03 PM

Running Jeff's commands from #44 produces the same output as the method suggested in the wiki: only a few lines are printed and then nothing happens; no nested session appears:


linux@linux-Dell-Precision-M3800:~$ valgrind --show-reachable=no --vgdb-error=0 --tool=memcheck enlightenment
==2586== Memcheck, a memory error detector
==2586== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==2586== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==2586== Command: enlightenment
==2586==
==2586== (action at startup) vgdb me ...
==2586==
==2586== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==2586==   /path/to/gdb enlightenment
==2586== and then give GDB the following command
==2586==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=2586
==2586== --pid is optional if only one valgrind process is running
==2586==
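
A note on the output above: `--vgdb-error=0` tells Valgrind to start its embedded gdbserver before the program runs and then wait for a debugger to connect, which would explain why only the banner appears and no nested session shows up. Below is a minimal sketch that pulls the PID out of the banner and prints the attach steps the banner itself asks for. The helper names `valgrind_pid` and `gdb_attach_hint` are made up for illustration, and it assumes `vgdb` is on the PATH.

```shell
#!/bin/sh
# Hypothetical helpers, not part of Valgrind or Moksha.

# Print the PID taken from the first "==PID==" prefix found on stdin.
valgrind_pid() {
    sed -n 's/^==\([0-9][0-9]*\)==.*/\1/p' | head -n 1
}

# Print the two steps the Valgrind banner asks for, plus "continue",
# which lets the waiting program actually start running.
gdb_attach_hint() {
    pid=$1
    printf 'gdb enlightenment\n'
    printf '(gdb) target remote | vgdb --pid=%s\n' "$pid"
    printf '(gdb) continue\n'
}

# Usage: gdb_attach_hint "$(valgrind_pid < output.txt)"
```

Alternatively, leaving out `--vgdb-error=0` should let enlightenment start right away, at the cost of not stopping in GDB when an error is reported.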


#46 Astroboy

Posted 25 January 2017 - 10:12 PM

Running the command Charles suggested in #43, with the =all left out:

enlightenment_start -valgrind --show-reachable=no --vgdb-error=0 > output.txt 2>&1

produces the following output. As before, I was unable to produce the Moksha segfault on the nested display but, as usual, could trigger it easily on the main display.

 

Spoiler

 



#47 Astroboy

Posted 25 January 2017 - 10:25 PM

And these are the results of running

enlightenment_start -valgrind --show-reachable=no --vgdb-error=0 --leak-check=full > output.txt 2>&1

that is, with the "--leak-check=full" parameter added, as suggested by Valgrind.
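
For what it's worth, Valgrind can also write its report to a file on its own via `--log-file`, where `%p` expands to the process ID, so the report stays separate from whatever Moksha itself prints. The sketch below only assembles the command line. `build_valgrind_cmd` and the default log name are hypothetical, and it runs the program directly, as in #44, rather than through `enlightenment_start`.

```shell
#!/bin/sh
# Hypothetical helper: print a Memcheck command line for a given program.
build_valgrind_cmd() {
    prog=$1
    logfile=${2:-valgrind-%p.log}   # Valgrind replaces %p with the PID
    printf 'valgrind --tool=memcheck --leak-check=full --show-reachable=no --log-file=%s %s\n' \
        "$logfile" "$prog"
}

# Print the command for inspection before running it by hand.
build_valgrind_cmd enlightenment
```

Printing the command first, instead of running it, makes it easy to check the options before starting a session under Valgrind.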

 

I can confirm that the leak/error Valgrind found corresponds to the Moksha restart triggered by moving a window around on the main display. As I stated before, the Moksha segfault is not reproducible on the nested display, but it is easily reproduced on the main display merely by moving a window around and, fortunately, Valgrind detects it.

 

This issue keeps spreading. Yesterday we did an installation in a computer lab with 42 computers of only two brands. Unfortunately, on both models the Moksha restart happens merely by moving a window, so every computer in the lab has this bug. The only one that didn't show this behavior was the teacher's laptop...

 

 

Spoiler



#48 Jeff

Posted 26 January 2017 - 06:47 AM

Out of curiosity - what theme are you using? Does changing the theme affect the segfault?



#49 Jeff

Posted 26 January 2017 - 06:53 AM

Still fumbling with valgrind a bit myself. Does running these commands work for you? They should create a new X session with Moksha launched:
 

sudo X -ac :1 &
export DISPLAY=:1
valgrind --show-reachable=no --vgdb-error=0 --tool=memcheck enlightenment


#50 Astroboy

Posted 26 January 2017 - 09:00 AM

In reply to Jeff's question in #48 about the theme: the theme in use is "Default". Remember that we also reproduced this bug with a freshly created .e/ directory, that is, a Moksha configuration exactly as shipped in Bodhi 4.0, without any changes.

 

http://forums.bodhil...lt/#entry102125



#51 Astroboy

Posted 26 January 2017 - 09:03 AM

 

Same results with the commands from #49. Valgrind outputs a few lines and then does nothing, even after I reproduce the Moksha segfault.

linux@linux-Dell-Precision-M3800:~$ sudo X -ac :1 &
[1] 2123
linux@linux-Dell-Precision-M3800:~$ export DISPLAY=:1

[1]+  Stopped                 sudo X -ac :1
linux@linux-Dell-Precision-M3800:~$ valgrind --show-reachable=no --vgdb-error=0 --tool=memcheck enlightenment
==2134== Memcheck, a memory error detector
==2134== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==2134== Using Valgrind-3.11.0 and LibVEX; rerun with -h for copyright info
==2134== Command: enlightenment
==2134==
==2134== (action at startup) vgdb me ...
==2134==
==2134== TO DEBUG THIS PROCESS USING GDB: start GDB like this
==2134==   /path/to/gdb enlightenment
==2134== and then give GDB the following command
==2134==   target remote | /usr/lib/valgrind/../../bin/vgdb --pid=2134
==2134== --pid is optional if only one valgrind process is running
==2134==
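
One thing worth noting in that transcript: bash reports the background `sudo X -ac :1` job as stopped, possibly because sudo tried to read a password from the terminal while in the background. If so, no X server was ever listening on `:1`. Below is a sketch for checking that before launching anything. It assumes the usual Linux socket location `/tmp/.X11-unix/X<n>`, and the function names are made up.

```shell
#!/bin/sh
# Hypothetical helpers for checking that an X display is actually up.

# Map a display name such as ":1" (or ":1.0") to its Unix socket path.
display_socket() {
    n=${1#:}       # drop the leading colon
    n=${n%%.*}     # drop an optional ".screen" suffix
    printf '/tmp/.X11-unix/X%s\n' "$n"
}

# Wait up to N seconds (default 10) for the socket to appear.
wait_for_display() {
    sock=$(display_socket "$1")
    tries=${2:-10}
    while [ "$tries" -gt 0 ]; do
        [ -S "$sock" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}

# Usage, caching sudo credentials first so sudo does not prompt in the background:
#   sudo -v
#   sudo X -ac :1 &
#   wait_for_display :1 && DISPLAY=:1 valgrind --tool=memcheck enlightenment
```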


#52 Jeff

Posted 26 January 2017 - 01:46 PM

Does that launch a second Moksha instance on a new display server as intended?

#53 Jeff

Posted 26 January 2017 - 08:23 PM

So I think I might have made some headway here. I get occasional locks on one of my systems and I was able to produce one in a Xephyr window today. I get this message spammed over and over again when it happens:

ERR<eo>lib/eo/eo.c:462 in lib/edje/edje_object.eo.c:338: func 'edje_obj_signal_emit' (719) could not be resolved for class 'Efl_Canvas_Group'.

So I just need to chase down what is triggering this in the Moksha code base.
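
To see how often that message shows up in a captured log, something like the following could help. It is only a sketch: `count_eo_errors` is a made-up name, and `output.txt` is the capture file used earlier in this thread.

```shell
#!/bin/sh
# Hypothetical helper: count lines mentioning the unresolved EO function.
count_eo_errors() {
    # -F matches the text literally; -c prints the number of matching lines.
    grep -cF "func 'edje_obj_signal_emit'" "$1"
}

# Usage: count_eo_errors output.txt
```

Comparing the count before and after moving a window around would show whether the message really tracks the lock.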



#54 Astroboy

Posted 26 January 2017 - 11:18 PM

In reply to #52: nope. No second Moksha instance appears with those parameters.



#55 Astroboy

Posted 26 January 2017 - 11:22 PM

Regarding the 'edje_obj_signal_emit' error Jeff posted in #53: I believe that is the cause of the problem, too. I had seen references to the 'Efl_Canvas_Group' class many times in all the test logs.



#56 Astroboy

Posted 27 January 2017 - 01:19 PM

Following gohlip's lead (http://forums.bodhil...cs/#entry102895), I was hoping that the Moksha segfault had somehow been solved in 4.1, but no luck. I can still reproduce it on Bodhi 4.1 installed as it comes, with no additional packages and no changes to the default Moksha desktop.



#57 Jeff

Posted 27 January 2017 - 01:36 PM

The packages in 4.1.0 are largely the same as what you've had for a while now: just a bumped version number and some new config files for the changed default theme. I still haven't been able to chase down where this segfault is coming from.



#58 Jeff

Posted 27 January 2017 - 06:39 PM   Best Answer

Out of curiosity - does disabling the window geometry text in Window Display settings make the crashes stop for you?

oNKMVxs.png

I haven't been able to produce a lock on my system since disabling these. I'm going to look into fixing it, but for now this seems like a good workaround.



#59 Astroboy

Posted 02 February 2017 - 10:01 PM

You are a genius!!!!

 

With this change, the problem is solved!!!!

 

Now, what could be the consequences or side effects of turning those settings off? (My reasoning is that if they were turned on by default in E, there must be a good reason for it...)



#60 Jeff

Posted 02 February 2017 - 10:16 PM

All those check boxes do is display the current sizing / position of the window on the screen as it moves around / resizes.

Thank you for confirming this resolves the segfault. I am going to try to dig into the source that displays these and see if I can figure out what is causing it - for now we have a workaround at least.

 

Sorry for the runaround.





