Opened 4 years ago

Last modified 2 years ago

#1717 reopened defect

hwloc: lnuma & lltdl *really* required?

Reported by: zachetie@…
Owned by:
Priority: minor
Milestone:
Component: EinsteinToolkit thorn
Version: development version
Keywords:
Cc: zachetie@…

Description

I downloaded the ET devel version (ca. Nov 26) on my Ubuntu laptop and compiled it using gcc.

I compiled to the linker stage, and the linker complained:
ld: cannot find -lnuma
ld: cannot find -lltdl

I found references to these libraries in:
configs/[buildname]/bindings/Configuration/Capabilities/make.HWLOC.defn

After removing these references, the code compiled and seemed (on the surface) to run okay. Are these libraries really necessary?

I ask because every time I need to install ET on a new machine, it would be more convenient if the step "apt-get install libnuma-dev libltdl-dev" were left out, particularly since reliable Internet access may not exist at that time.

Attachments (0)

Change History (35)

comment:1 Changed 4 years ago by Erik Schnetter

The content of this file is auto-generated. Cactus's hwloc configuration script asks the hwloc installation which libraries are required to link against hwloc, and apparently this is the answer.

How did you configure hwloc -- are you sure that the version installed on your system is used, and that Cactus does not build hwloc? You can post the screen output of the configuration stage, which would allow us to decide if you can't tell.

Can you post the output of "pkg-config hwloc --static --libs" and/or "pkg-config hwloc --libs" on your system?

comment:2 Changed 4 years ago by zachetie@…

Looks like I installed the following packages on my Ubuntu 14.04 system prior to compiling ET:
libhwloc-dev:amd64
libhwloc-plugins
libhwloc5:amd64

pkg-config hwloc --static --libs

-lhwloc -lm -lnuma -lltdl -lpthread -ldl

pkg-config hwloc --libs

-lhwloc

After uninstalling the libhwloc-dev, libnuma-dev, and libltdl-dev packages, I recompiled ET from scratch, and there were no linker problems. So perhaps there is a bug in the ET build system when hwloc-dev is installed?

comment:3 Changed 4 years ago by Erik Schnetter

I do not think this is a bug in the ET build system, as the ET gets this information from hwloc. I would suspect that the hwloc information is wrong. It may be that hwloc's package manager didn't realize that libhwloc-dev needs to depend on libnuma-dev and libltdl-dev.

If you don't install libhwloc-dev, then the ET will build hwloc from scratch instead of using the system version. This will always work.

comment:4 Changed 4 years ago by anonymous

Yes, you seem to be correct about the case in which libhwloc-dev is installed.

I think the problem lies with configure.sh, as it requests data from

"pkg-config hwloc --static --libs".

Given the output from pkg-config commands above, is it possible that changing configure.sh to call

"pkg-config hwloc --libs"

instead (i.e., without "--static") would fix the problem?
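
For reference, the flags that "--static" adds can be listed by comparing the two pkg-config outputs quoted in comment:2. The following is only a sketch: private_libs is a hypothetical helper name, not something that exists in Cactus or hwloc, and the two strings are copied from the output above rather than queried from a live system.

```shell
#!/bin/sh
# Hypothetical helper (not part of Cactus or hwloc): print the flags that
# appear in the --static output ($1) but not in the plain --libs output ($2).
private_libs() {
    static_libs=$1
    shared_libs=$2
    result=
    for flag in $static_libs; do
        case " $shared_libs " in
            *" $flag "*) ;;                           # also needed for dynamic linking
            *) result="${result:+$result }$flag" ;;   # only needed for static linking
        esac
    done
    printf '%s\n' "$result"
}

# Strings taken from comment:2
private_libs "-lhwloc -lm -lnuma -lltdl -lpthread -ldl" "-lhwloc"
# prints: -lm -lnuma -lltdl -lpthread -ldl
```

On the reporter's system, -lnuma and -lltdl from this list are the two flags the linker could not resolve.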

comment:5 Changed 4 years ago by Erik Schnetter

This may or may not circumvent the problem. I suspect it would. At some point we decided to use static libraries as much as possible when building Cactus, since this reduces dependencies on things that can change after the executable has been built. I don't think this is a priority any more. You could try this; note that hwloc's configuration script already omits the --static as a fallback, if pkg-config does not support this option.
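
The fallback described here could look roughly like the sketch below. This is not the actual code from hwloc's configuration script in Cactus; hwloc_link_flags is a hypothetical name, and the optional first argument (the pkg-config command to run) is an assumption added so that the logic can be tried without a real hwloc installation.

```shell
#!/bin/sh
# Hypothetical sketch: ask pkg-config for static link flags first, and fall
# back to the plain (dynamic) flags if the --static query fails.
hwloc_link_flags() {
    pkgconfig=${1:-pkg-config}   # assumed parameter; defaults to the real tool
    "$pkgconfig" hwloc --static --libs 2>/dev/null ||
        "$pkgconfig" hwloc --libs
}
```

Called without arguments, it runs the same two queries that were requested in comment:1, in that order. Note that the fallback only triggers when the --static query fails outright; it does not help when pkg-config succeeds but names libraries that are not installed, which is the situation in this ticket.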

In general, if there is a machine where something is broken, we use Simfactory's option list to circumvent the problem. In this case, we would set "hwloc=BUILD" in the options, probably accompanied by a comment explaining why.

comment:6 Changed 4 years ago by Frank Löffler

Given that the manpage for pkg-config already mentions:

       --static
              Output  libraries  suitable  for  static linking.  That means including any private
              libraries in the output.  This relies on proper tagging in the .pc  files,  else  a
              too large number of libraries will ordinarily be output.

my first guess would be that hwloc itself is at fault: nothing we could fix.

comment:7 Changed 4 years ago by Ian Hinder

Close as invalid?

comment:8 Changed 4 years ago by Erik Schnetter

No; I asked Zach a question, and am waiting for his reply.

comment:9 Changed 4 years ago by zachetie@…

Hi Erik,

I like the idea of a workaround within ET/hwloc, e.g., by disabling the --static within hwloc/configure.sh unless a static build is explicitly requested. Speaking of which, is there a configuration option to compile ET statically? If not, the default options should at least be consistent with static or dynamic, and I would argue that the default should be a regular, non-static compilation (i.e., _without_ the --static option within hwloc/configure.sh).

comment:10 Changed 4 years ago by Erik Schnetter

Zach, does it actually work if we omit the --static?

comment:11 Changed 4 years ago by Zach Etienne

Erik,

Yes, I just verified that the compile does work if the --static is omitted, and fails if --static is included.

comment:12 Changed 4 years ago by Erik Schnetter

As a general rule, we don't want to have work-arounds for particular systems in Cactus since this is very fragile. Over time, these work-arounds accumulate, and in the end no one remembers why some things are done in a particular and very complex way. Sometimes we find comments about systems that no one even knows any more. For example, do you know why the C++ compiler on OSF systems requires the option "-noimplicit_include"? Does anybody even know what OSF is, without resorting to Google? (Hint: It is a version of Unix.) Apparently this option was introduced in 1999... I'm quite sure that the respective logic can be deleted and no one will ever notice, but no one has time to deal with these kinds of clean-ups because, sometimes, these work-arounds have subtle side-effects that break things when they are removed.

As I mentioned before, it is very easy (a one-line addition) to update the option list for your machine to avoid this problem. Also, since your system seems broken, you'd have to explain why you cannot correct your install, and why you think that modifying Cactus instead is a better idea.

However, you raise the question of whether Cactus should be linked statically or dynamically by default. These days, we tend to like static linking because (a) disk space usage is not really a concern, and (b) this means that executables are more independent once created. If you have a dynamically linked executable and then uninstall a certain library, it may break. This is an issue on supercomputers where production runs may take weeks or months, and where someone may have installed a library into his/her home directory and others are then using it. Once an executable is broken this way, it is very difficult (if not impossible) to repair it. Static linking avoids this issue.

As a bonus, static linking also uncovers errors (duplicate symbols) that may go undetected with dynamic linking.

comment:13 Changed 4 years ago by Zach Etienne

I am agnostic about whether static or dynamic linking should be chosen as default, though I anticipate more headaches if static were the default (you brought up the standard reasons).

Further I would argue that we should make *one* choice as default and stick with it consistently, and not something between static and dynamic, as it creates confusion (this case for example).

comment:14 in reply to:  12 Changed 4 years ago by Frank Löffler

I agree about the problem with workarounds. However:

Replying to eschnett:

As I mentioned before, it is very easy (a one-line addition) to update the option list for your machine to avoid this problem. Also, since your system seems broken, you'd have to explain why you cannot correct your install, and why you think that modifying Cactus instead is a better idea.

This seems to be a generic Ubuntu installation. It is not an isolated machine. New users are not unlikely to hit the same issue, unless they know to choose the (then fixed) option list. Also, if this is a problem with the .pc files in hwloc, then this is likely a problem even outside of Ubuntu.

However, you raise the question of whether Cactus should be linked statically or dynamically by default. These days, we tend to like static linking because (a) disk space usage is not really a concern, and (b) this means that executables are more independent once created.

This might be true on a supercomputer. I certainly like dynamic libraries more when I develop, i.e., most of the time on my laptop/workstation. So, I would answer this with "it depends". I would think both should work.

comment:15 Changed 4 years ago by Erik Schnetter

Frank -- what do you suggest concretely?

If you think that a standard Ubuntu may be broken, then we should update our standard Ubuntu option list. We can either make it build hwloc ourselves, or at least add the missing packages to the comments at the top of this list.

There is always a dilemma between using a pre-existing library and building things on our own. I usually prefer to build my own, since this is more likely to work. Others prefer using existing libraries. In this case, we probably need to test existing libraries more thoroughly before we use them.

comment:16 in reply to:  15 Changed 4 years ago by Frank Löffler

Replying to eschnett:

Frank -- what do you suggest concretely?

If I read the following correctly, this could be a rather interesting problem:
http://www.open-mpi.org/community/lists/hwloc-devel/2013/05/3743.php

The problem seems to be that when used as a dynamic library, hwloc does _not_ depend on libltdl as a "usual dynamic library", but can load it later by itself, if found (dynamically, but ldd does not see that). If built statically, of course, ltdl would need to be linked in for it to be usable. So, if I read this correctly, the hwloc-dev package has the option of depending on the other two libraries at build-time, but does not have to (it builds without these libraries, but it can use these libraries in a dynamic setup). Does anybody here agree with my interpretation?

That leaves the user with an installed hwloc that uses both numa and ltdl if present (because it was compiled with support for them), and the library correctly reports the link-dependencies both for dynamic and static linking. However, the -dev package does not have a dependency on the other two library packages because they are optional, at least for the dynamic libraries that are typically used on the system. So, the problem is: should the hwloc-dev package depend on the numa/ltdl libraries? It should for static linking, and should not for dynamic linking. Since the hwloc package only provides the dynamic version, and since by far the majority of users would link dynamically, their decision to not add the dependency is correct in my eyes.

Now the question would be: what do we do in this situation? We could test for these libraries to be present when we build hwloc and link statically, and give a better error message.

Last edited 4 years ago by Frank Löffler

comment:17 Changed 4 years ago by Erik Schnetter

No, the hwloc package manager's decision is not correct. libhwloc-dev provides a library libhwloc.a that cannot be used without libltdl.a. Thus, it either needs to depend on the libltdl-dev package, or needs to provide a library libhwloc.a that does not have this dependency.

If you want to provide a work-around, then I suggest doing this in the system-specific file "ubuntu.cfg" of Simfactory. Of course, you can also check in hwloc's configure script whether we are using the system hwloc library, whether it depends on libltdl, whether this library is installed, etc., and if so, refuse to use the system library. We usually don't go to these lengths when looking for existing libraries, though. If you really want to go this route, then I would suggest using autoconf or cmake for this, which provide exactly this kind of functionality.

comment:18 in reply to:  17 Changed 4 years ago by Frank Löffler

Replying to eschnett:

No, the hwloc package manager's decision is not correct. libhwloc-dev provides a library libhwloc.a that cannot be used without libltdl.a. Thus, it either needs to depend on the libltdl-dev package, or needs to provide a library libhwloc.a that does not have this dependency.

Yes, I missed the .a file in the -dev package. I assumed this to be in the hwloc package, if at all present. In this case, it would indeed need to be reported to the Ubuntu package maintainers. The respective Debian package has the missing dependencies, but not yet in the released version - which means Debian is likely affected too, in the current release.

If you want to provide a work-around, then I suggest doing this in the system-specific file "ubuntu.cfg" of Simfactory.

I don't think that the current option lists for these specify a static build. We should probably give the correct flag (--static or not) to pkg-config, and in this case that should work, shouldn't it?

comment:19 Changed 4 years ago by Erik Schnetter

When you link against a library, then you need to decide whether to link statically or dynamically. That is independent of whether other libraries are linked statically or dynamically, or whether the final executable is a dynamic library. Here, we choose (as in many other external libraries) static libraries. I explained the reasons for this above.

comment:20 Changed 4 years ago by Erik Schnetter

Resolution: fixed
Status: new → closed

Fixed by disabling the system hwloc in ubuntu.cfg

comment:21 Changed 4 years ago by Ian Hinder

Should this apply also to debian, where Frank says (comment:18) the problem also exists?

comment:22 Changed 4 years ago by Ian Hinder

Resolution: fixed (removed)
Status: closed → reopened

Note that this problem was reported earlier in #1632.

comment:23 in reply to:  21 Changed 4 years ago by Frank Löffler

Replying to hinder:

Should this apply also to debian, where Frank says (comment:18) the problem also exists?

Only if it really turns out to be a problem there. The package dependencies look similar, but I can build the current version (at least the last time I tried). Maybe the static library does have different dependencies there.

comment:24 Changed 4 years ago by Roland Haas

There is a similar problem with the numa lib, static linking and a self-built hwloc. Currently detect.sh only claims hwloc as a required library (HWLOC_LIBS=hwloc) when building hwloc; however, hwloc's configure script will find libnuma and use it, which makes numa a required library in that case only. A workaround right now is to add numa to LIBS, but this has to be done by the user, since whether numa is required depends on both the software installed on the machine and hwloc's configure script.

Right now there seems to be no good way to fix this, since detect.sh sets HWLOC_LIBS before configure runs so cannot know what configure will do. As far as I can tell, the only way to fix this is to run the ExternalLibraries configure script in detect.sh but then defer building to build.sh.

comment:25 Changed 4 years ago by anonymous

Can pkg-config be run after configure, but before building (and installing) the library? If not, we would also need to have another way of finding the libraries than pkg-config.

comment:26 Changed 4 years ago by Roland Haas

Milestone: ET_2015_05

comment:27 Changed 3 years ago by Frank Löffler

Milestone: ET_2015_05 → ET_2016_05

comment:28 Changed 3 years ago by Frank Löffler

The currently proposed, short-term (release), dirty solution (thanks, Steve) would be: within detect.sh, emulate the library's 'configure' by calling the compiler with a minimal file and, e.g., -lnuma; if that succeeds, we assume the compiler found libnuma and could use it, and that hwloc's 'configure' will find it too.
This is obviously not the best solution; it is a hack, and a dirty one. It assumes things that can, and probably will, go wrong on some machines. However, it should work on most machines, and it would not be very invasive, especially so shortly before a release.

Any opinions on this before we try it?
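
A minimal sketch of such a link probe is given below. It is not code from detect.sh: can_link_with is a hypothetical helper, passing the compiler command as the first argument is an assumption, and the usage at the end only illustrates how HWLOC_LIBS (the variable named in comment:24) might be set from the result.

```shell
#!/bin/sh
# Hypothetical helper, not part of Cactus: succeed if the compiler command
# in $1 can link a trivial program against the library named in $2.
can_link_with() {
    cc=$1
    lib=$2
    tmpdir=$(mktemp -d) || return 1
    printf 'int main(void) { return 0; }\n' > "$tmpdir/probe.c"
    # $cc is left unquoted on purpose so it may carry flags (e.g. "gcc -static").
    if $cc "$tmpdir/probe.c" -o "$tmpdir/probe" "-l$lib" >/dev/null 2>&1; then
        status=0
    else
        status=1
    fi
    rm -rf "$tmpdir"
    return "$status"
}

# Assumed usage in the spirit of detect.sh; everything except the name
# HWLOC_LIBS is illustrative.
if can_link_with "${CC:-cc}" numa; then
    HWLOC_LIBS="hwloc numa"
else
    HWLOC_LIBS="hwloc"
fi
echo "HWLOC_LIBS=$HWLOC_LIBS"
```

As noted above, a successful probe only shows that the compiler can link against libnuma; it does not prove that hwloc's 'configure' will reach the same conclusion.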

comment:29 Changed 3 years ago by Erik Schnetter

I would not make changes just before a release. I would provide a stop-gap solution that enables people to build the ET. Since there seems to be a work-around via "apt-get", I would go with this.

comment:30 Changed 3 years ago by Frank Löffler

There are two problems mixed in this ticket. The first is the one involving system libraries, and I agree that at least for the release specifying a list of packages is a viable workaround.

The other problem, however, revolves around a self(Cactus)-built hwloc. The underlying problem is similar, but here we cannot point people to an option list, because this can happen on any machine.

comment:31 Changed 3 years ago by Erik Schnetter

In this case people can set HWLOC_EXTRA_LIBS in their option list.

comment:32 in reply to:  31 Changed 3 years ago by Frank Löffler

Replying to eschnett:

In this case people can set HWLOC_EXTRA_LIBS in their option list.

Interesting possibility. I wouldn't expect that to be necessary if Cactus builds hwloc itself.

comment:33 Changed 3 years ago by Frank Löffler

Milestone: ET_2016_05 → ET_2016_11

Given the proximity of a release, and that this is not a regression (was the case also in older releases), I propose to leave it as is for now.

After the release I propose to open the possibility to have the build scripts (build.sh) overwrite/set Cactus build variables, specifically HWLOC_LIBS in this case, but also others if need be (as detect.sh). The "only" issue I see with this is that it would need to be done in a way that parallel builds still work.

comment:34 Changed 3 years ago by Frank Löffler

Also, note that there is another ticket about hwloc: #1753. That's not strictly the same issue, but whoever fixes one should know about the other.

comment:35 Changed 2 years ago by Frank Löffler

Milestone: ET_2016_11 (removed)
