Tuesday, June 30, 2009

p2 UI policy and Declarative Services

This is another post in what is becoming a short (so far only two) series about moving a product from Eclipse 3.4 to 3.5.

After I got my build working, the next step was making sure that I could update from one product version to the next. I was especially excited about the resolution of https://bugs.eclipse.org/bugs/show_bug.cgi?id=246060 which allows for .qualifier to be replaced in a product version. No longer would I have to manually increment the product version number for purposes of updating to and testing a nightly build.

So my plan was:

  1. Run PDE product build to generate version 1.0.0.abc
  2. Unzip 1.0.0.abc to some location.
  3. Run PDE product build again to generate version 1.0.0.def
  4. Launch 1.0.0.abc, point it at the repository for 1.0.0.def, and update.
  5. ...
  6. Profit.

Unfortunately, when I launched 1.0.0.abc, the Install New Software dialog didn't have a way for me to add a new repository. Ditto for the preference page.

Turns out there is a more robust set of p2 UI building blocks in 3.5, which is handy for RCP developers. That is described in great detail here: http://wiki.eclipse.org/Equinox/p2/Adding_Self-Update_to_an_RCP_Application

I should mention that the RCP-p2 example in 3.5 is leaps and bounds ahead of the one from 3.4 (there wasn't one) - so props to the p2 UI team on that.

At any rate, the wiki page tipped me off that there is a UI policy which controls which components are shown and enabled. This policy is implemented as an OSGi declarative service. What really threw me for a loop is that I wasn't trying to do anything special with this policy. I just wanted the stock SDK one, since our product is based on the SDK.

Debugging the Policy Behavior

I stepped through the preference page code and discovered that the SDKPolicy wasn't being found as a service; the page was just getting an empty Policy every time. So this sent me down the route of launching with -console to see the OSGi console and look for the policy service. After fighting with the filter syntax for the services <filter> console command, I googled a bit more and found some useful runtime options for spitting out verbose DS logging information (see Equinox Runtime Options under Useful Links). I turned those on but nothing got logged. I was pretty stumped at this point.

Then a light bulb came on: maybe declarative services wasn't running at all? A quick ss ds at the console showed that the bundle was RESOLVED but not ACTIVE! I did a start to spin it up and all of a sudden a deluge of DS logging information printed out. And then SDKPolicy started working, and voila, my p2 UI was working.

It turns out the root cause is that we had a custom config.ini in 3.4 to specify a custom osgi.instance.area location. This was screwing up the start level for the ds bundle. I switched the product to generate a config.ini for me, did a new build, and everything worked. I plan to migrate the osgi.instance.area configuration step to a p2.inf file, which is what the platform releng guys do.
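For the planned p2.inf migration, a minimal sketch might look like the fragment below. It assumes the setProgramProperty touchpoint action from the Eclipse touchpoint is available in the build, and the property value shown is a hypothetical placeholder (the actual custom location is not given here). Treat it as an illustration of the shape, not as the configuration the platform team ships.

```properties
# p2.inf (sketch), placed next to the .product file.
# setProgramProperty asks p2 to write the property into the generated
# config.ini at install time, so no hand-written config.ini is needed.
# The value below is a hypothetical placeholder, not the real setting.
instructions.configure=\
  setProgramProperty(propName:osgi.instance.area,propValue:@user.home/myproduct-workspace);
```

The point of doing it this way is that the generated config.ini, including whatever start-level information the build writes, stays intact and only the one property is added on top.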

Useful Links

[1] Equinox Runtime Options
[2] Explore Eclipse's OSGi Console
[3] Around the world in Java: Getting Started with OSGi Declarative Services
[4] p2 UI policy bug #1
[5] p2 UI policy bug #2

Monday, June 29, 2009

Debugging PDE Build and the publisher

I posted a problem to the PDE newsgroup last week about unexpected requirements in my product feature. This was in the context of moving a 3.4-based product to 3.5.

The general issue was that the director wouldn't install my product because of an unsatisfied requirement. It wasn't clear to me where this requirement was even coming from. Somewhere, there was some metadata in my plugins/features that expressed a dependency that had worked fine in 3.4 but failed in 3.5. My theory was that if I could capture when the publisher was generating the requirement, I'd be able to see the source of that requirement and squash it.

Tracing

My first attempt was to turn on tracing for the p2 components. I managed to find the org.eclipse.equinox.internal.p2.core.helpers.Tracing class, which lists the different options. I stuffed those into a .options file:

org.eclipse.equinox.p2.core/debug=true
#org.eclipse.equinox.p2.core/generator/parsing=true
#org.eclipse.equinox.p2.core/engine/installregistry=true
#org.eclipse.equinox.p2.core/metadata/parsing=true
#org.eclipse.equinox.p2.core/artifacts/mirrors=true
#org.eclipse.equinox.p2.core/core/parseproblems=true
#org.eclipse.equinox.p2.core/planner/operands=true
#org.eclipse.equinox.p2.core/planner/projector=true
#org.eclipse.equinox.p2.core/engine/profilepreferences=true
org.eclipse.equinox.p2.core/publisher=true
#org.eclipse.equinox.p2.core/reconciler=true
#org.eclipse.equinox.p2.core/core/removeRepo=true
#org.eclipse.equinox.p2.core/updatechecker=true

Then the trick was to pass those options along to the AntRunner app which drives PDE build. I added -debug path/to/.options to my AntRunner arguments. Running the build again, I got two things, neither of which was helpful:

  1. Passing -debug to the Platform also passes -debug on to Ant, thanks to AntRunner. So my Ant ran in debug mode, which really clouded the issue with about 8 MB of debug output.
  2. The publisher only outputs two trace statements: start and finish. Nothing about what it is publishing. This may be a candidate for enhancement.

Based on these results, I concluded that nobody else can be using this technique to solve their p2 problems. Moving on.

Stepping through the publisher

Next up: run AntRunner with Java debugging enabled so that I could connect remotely and set breakpoints in the publisher actions. I added the appropriate JVM args to enable the Java Debug Wire Protocol (JDWP), started the build again, connected to it, and started setting breakpoints in various publisher actions.
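For reference, the JVM side of a setup like this usually comes down to one argument. The line below is a sketch using the standard JDWP agent option; the port number (8000) is an arbitrary assumption, and how the argument reaches the JVM depends on how your build launches AntRunner.

```text
-agentlib:jdwp=transport=dt_socket,server=y,suspend=y,address=8000
```

With suspend=y the JVM waits for a debugger before running anything, which leaves time to attach a Remote Java Application debug configuration from the IDE to the same port and set breakpoints before the publisher runs. Older JVMs take the equivalent -Xdebug -Xrunjdwp:... form.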

Since the rogue requirement was getting added to my product feature IU, I added a conditional breakpoint in FeaturesAction to look for that feature being processed.

Then, since the problematic requirement was org.eclipse.core.resources [3.4.0,3.5.0) I added another conditional breakpoint in getVersionRange to watch for incoming feature entries with 3.4.0 as their minimum version.

I did finally discover the problem: I had a bunch of old, outdated entries in my product feature's feature.xml, which included references to several different versions of o.e.core.resources. After I ripped those out, I had a successful build and director install.
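To see why a requirement like org.eclipse.core.resources [3.4.0,3.5.0) cannot be satisfied by a 3.5 bundle, it helps to spell out the range semantics: the square bracket makes the lower bound inclusive and the parenthesis makes the upper bound exclusive. The following is a minimal Java sketch of that check. It is not the p2 or OSGi API; the class and method names are hypothetical, and qualifiers are ignored for simplicity.

```java
/** Minimal sketch of an OSGi-style [min,max) version range check (hypothetical helper, not the p2 API). */
public class VersionRangeSketch {

    /** Parses "major.minor.micro" into three ints; missing segments are 0 and any qualifier is ignored. */
    public static int[] parse(String version) {
        String[] parts = version.split("\\.");
        int[] result = new int[3];
        for (int i = 0; i < 3 && i < parts.length; i++) {
            result[i] = Integer.parseInt(parts[i]);
        }
        return result;
    }

    /** Compares two versions segment by segment; negative, zero, or positive like compareTo. */
    public static int compare(String a, String b) {
        int[] left = parse(a);
        int[] right = parse(b);
        for (int i = 0; i < 3; i++) {
            if (left[i] != right[i]) {
                return Integer.compare(left[i], right[i]);
            }
        }
        return 0;
    }

    /** True if candidate lies in [min, max): min is inclusive, max is exclusive. */
    public static boolean inRange(String min, String max, String candidate) {
        return compare(candidate, min) >= 0 && compare(candidate, max) < 0;
    }

    public static void main(String[] args) {
        // A 3.4.x bundle satisfies the stale requirement...
        System.out.println(inRange("3.4.0", "3.5.0", "3.4.2"));
        // ...but a 3.5.0 bundle does not, because the upper bound is exclusive.
        System.out.println(inRange("3.4.0", "3.5.0", "3.5.0"));
    }
}
```

So a leftover feature entry pinned to the 3.4 stream produces a requirement that nothing in a 3.5 target can meet, which is the kind of failure the director reported.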

Conclusions

  • Do not pass the debug flag to AntRunner for purposes of debugging platform code unless you are prepared to wade through volumes of output. (I guess this is a feature of AntRunner - https://bugs.eclipse.org/bugs/show_bug.cgi?id=5672)
  • It was not at all obvious to me that I could debug p2 actions by setting up a "remote" debug session with PDE build running inside of AntRunner. But it was sure as heck helpful once I figured it out.
  • I am actually glad that I ran across this problem, and that p2 is enforcing these types of constraints, because it helped me clean up outdated dependencies in my feature.

How are you debugging your p2 builds?