Sun, 21 Sep 2008
For the impatient reader: New lintian in
experimental, please test and give feedback. You will miss most
changes though unless you read the rest of the post
(Hint, Hint ;))
During the past week I've uploaded new lintian versions to experimental
which we designated to be release candidates for 2.0.0. Code-wise the changes
are not that much more intrusive than for many of our past releases, but
they change the way lintian classifies tags in a fundamental way, thanks
to the hard work of Jordà Polo in his
Google Summer
of Code project (mentored by Marc Brockschmidt).
Lintian Tag Classification, old and new
Previously lintian classified tags only in one dimension, in the
categories "Info", "Warning", and "Error". While this worked reasonably
well, the difference between the categories was not very well defined.
The general idea was that everything violating a "must" in Debian Policy
or endangering the building or usage of the package should be an "Error",
i.e. something very similar to the definition of RC bugs (except that
not all "must"s in Policy are deemed worthy of filing RC bugs). Some
errors were downgraded to "Warning" or even "Info" though on the basis
that their detection was too prone to false positives. Because of this,
there was a long-standing desire to split the classification of tags into
two dimensions, one for the impact/importance of the tag, and one for
the certainty of its correct detection. This should make it easier for
people to interpret and/or filter the output.
At various points in the last few years people began to work on this
but quickly gave up, usually overwhelmed by the sheer number of tags
(728 in 2.0.0~rc2) to classify anew and to make sure that the old and
new categorisation could exist side-by-side (because breaking backwards
compatibility was not really feasible).
Finally this year Jordà Polo decided to tackle this task as a Google
Summer of Code project, with great success. Tags are now classified in
two dimensions "Severity" (with the possible
values wishlist, minor, normal, important, serious, which are
intentionally very close to the available severities in the Debian bug
tracking system), and "Certainty" (possible values: wild-guess, possible,
certain). A third classification by "Source" (i.e. Policy, Developers
Reference, ...) is planned but not yet fully implemented.
For backwards compatibility there is a mapping of these new
classifications to the old ones (which led to a few reclassifications
of tags). The default output of lintian is unchanged. The new output
formats that support the classification are still experimental (see
below).
How to use it
You can specify exactly which levels of Severity and Certainty you
want to have displayed with the new --display-level (-L)
option. Please see the manual page for the details, but to give you an
idea, the default behaviour (i.e. "show warnings and errors" in the "old"
vocabulary) is equivalent to specifying
-L ">=important" -L "+>=normal/possible" -L +minor/certain
And to get a report with only severe tags we're very certain of,
you could use
-L ">=important/certain"
which will only display tags that have severity "important" or "serious"
and a certainty of certain.
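To make the two-dimensional filtering more concrete, here is a minimal Python sketch of the idea behind a threshold like ">=important/certain". It is not lintian's implementation (lintian is written in Perl) and it does not parse the real -L syntax. The function name at_least and the sample tag list are made up for illustration; only the severity and certainty names come from the description above.

```python
# Ordered from least to most severe / least to most certain,
# using the names listed in the post.
SEVERITIES = ["wishlist", "minor", "normal", "important", "serious"]
CERTAINTIES = ["wild-guess", "possible", "certain"]


def at_least(min_severity, min_certainty="wild-guess"):
    """Return a predicate accepting tags at or above both thresholds.

    Hypothetical helper: it only models the "at least this severity and
    at least this certainty" case, not the full -L option syntax.
    """
    severity_floor = SEVERITIES.index(min_severity)
    certainty_floor = CERTAINTIES.index(min_certainty)

    def accepts(severity, certainty):
        return (SEVERITIES.index(severity) >= severity_floor
                and CERTAINTIES.index(certainty) >= certainty_floor)

    return accepts


# Invented (severity, certainty) pairs standing in for emitted tags.
tags = [
    ("serious", "certain"),
    ("important", "possible"),
    ("normal", "certain"),
    ("important", "certain"),
]

severe_and_sure = at_least("important", "certain")
shown = [tag for tag in tags if severe_and_sure(*tag)]
print(shown)  # [('serious', 'certain'), ('important', 'certain')]
```

Treating both dimensions as ordered lists keeps the comparison to two index lookups, which is all a threshold filter of this kind needs.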
There is also the (intentionally undocumented) option --exp-output which allows you to play with some experiments we're doing with
the output format.
--exp-output format=letterqualifier will give you an output
very similar to the "classic" one, but with additional information about
severity and certainty. --exp-output format=colons gives you a
colon-separated format which includes all the possible information lintian
currently has available during tag output and which should be easily
machine-consumable. Note that these formats are experimental and might
be changed at any point without notice. If you're interested in
using alternative formats for lintian output, please join
the mailing list
and talk to us about it.
Etc.
Other changes include the usual share of bug fixes and of course:
New tags
- description-contains-dh-make-perl-template
- doc-base-uses-applications-section (actually a split of
doc-base-unknown-section in two tags)
- embedded-pear-module
- embedded-php-library
- improbable-bug-number-in-closes
- maintainer-also-in-uploaders
- maintainer-script-ignores-errors
- manpage-has-errors-from-pod2man
- ored-build-depends-on-obsolete-package (actually a split of
build-depends-on-obsolete-package in two tags)
- package-superseded-by-perl
- versioned-dependency-satisfied-by-perl
- windows-devel-file-in-package
Credits
This lintian release is brought to you by (sorted by number of changesets):
- Jordà Polo
- Frank Lichtenheld
- Adam D. Barratt
- Raphael Geissert
- Russ Allbery
- Niko Tyni
- Marc 'HE' Brockschmidt
Sun, 14 Sep 2008
Since I can't upload git 1.6 from experimental to backports.org
(because that is only for packages available in testing), I've made a
backport to etch available at people.debian.org.
If you have any issues with that that don't affect the version in
experimental, feel free to drop me a mail.
Thu, 21 Aug 2008
(This is a more detailed and hopefully better structured explanation
of the topic of my Lightning Talk during Debconf)
With the ever growing number of packages in our archive we need
ever better methods to structure, sort, and search this collection.
While projects like debtags
try to improve the search in packages, my idea is about improving
the way the results are displayed.
There is one relation between packages that is currently never strongly
featured when presenting package meta-data:
the source ↔ binary relation.
Usually package managers operate either on binary packages
or source packages, but seldom do they expose the fact that
one is generated from the other.
In the trivial case of one source package producing one binary
package there is probably nothing to gain. But there are a large
number of source packages where this isn't true (at the moment
about 29% of all the source packages in main build more than
one binary package, together producing about 61% of all binary packages).
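As a rough illustration of how two such percentages can be computed, here is a small Python sketch. The function name multi_binary_share and the toy mapping are invented for this example. Real numbers would come from the Binary fields of the archive's Sources files, which this sketch does not read.

```python
def multi_binary_share(binaries_by_source):
    """Return (share of sources, share of binaries) for multi-binary sources.

    binaries_by_source maps a source package name to the list of binary
    packages it builds. Assumes the mapping is not empty.
    """
    total_sources = len(binaries_by_source)
    total_binaries = sum(len(b) for b in binaries_by_source.values())
    # Keep only the source packages that build more than one binary.
    multi = [b for b in binaries_by_source.values() if len(b) > 1]
    source_share = len(multi) / total_sources
    binary_share = sum(len(b) for b in multi) / total_binaries
    return source_share, binary_share


# Toy data, not taken from the archive.
toy_archive = {
    "a": ["a"],
    "b": ["b"],
    "c": ["c", "c-data", "c-doc"],
    "d": ["d"],
}
print(multi_binary_share(toy_archive))  # (0.25, 0.5)
```

Even in this toy case a quarter of the source packages accounts for half of the binary packages, which is the kind of skew the numbers above describe.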
I have the idea (which so far is nothing more) that if we could
present packages through their source packages, this could drastically
reduce the number of packages to choose from and make some common
search actions more effective.
This is not by any means a trivial task, though, since
not all source packages necessarily produce binary packages
that have much to do with each other, and even for the ones that do,
the exact relationship between the binary packages can vary greatly.
Also some source packages really belong together but are split
to make maintenance easier (e.g. KDE and X.org). These splits might not
necessarily make any sense to a user.
Some ideas what to implement in terms of code to make it easier to
let the user benefit from the fact that multiple binaries have the
same source:
- Add support for a Description field for source packages.
- For bonus points implement automatic substvars ${source:Description},
${source:Short-Description}, and ${source:Long-Description} that make
it trivial to reuse (parts of) that description in the descriptions of
the binary packages.
- Add some way to identify the "main" package built from a common
source.
- Identify source package usage patterns.
- There are a lot of source packages that share common patterns
about what they build. Example patterns include:
- Application
- Main package + -data package + -doc package + ...
- Library
- -dev package + shlib package + -doc package + ...
- Application + Library
- Collection of Applications
- etc.
I guess many of us already look out for these patterns when searching
for packages but identifying them explicitly might make this easier
and allow more users to benefit from this information.
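A very naive way to identify some of the patterns listed above could look like the following Python sketch. The heuristics (a -dev package means "Library", a binary named like the source means "Application") and the function name guess_pattern are assumptions made for this example only. Real packages will break them often, and "Collection of Applications" is not detected at all.

```python
def guess_pattern(source, binaries):
    """Guess a usage pattern from binary package names alone.

    Purely heuristic and hypothetical: it only looks at name suffixes
    and at whether a binary shares the source package's name.
    """
    has_dev = any(name.endswith("-dev") for name in binaries)
    has_main = source in binaries
    if has_dev and has_main:
        return "Application + Library"
    if has_dev:
        return "Library"
    if has_main:
        return "Application"
    return "unknown"


# Invented package names, for illustration only.
print(guess_pattern("foo", ["foo", "foo-data", "foo-doc"]))         # Application
print(guess_pattern("libbar", ["libbar1", "libbar-dev"]))           # Library
print(guess_pattern("baz", ["baz", "libbaz0", "libbaz-dev"]))       # Application + Library
```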
As everyone can see in the
graph of release
critical bugs, the number of RC bugs in etch had constantly risen
since the release (which was already noted in some other blogs) up until last
week. Since I suspected that this number included a lot of false
positives, I began to triage this list last week during Debconf.
I started with identifying all bugs that were filed against the
version in etch, but really only meant the package in testing/unstable
(but can't be interpreted correctly with version tracking alone since
the version in stable, testing, and unstable is actually identical).
There are a lot of
reasons why an RC bug might be valid for testing/unstable, but not
for stable, even though they share the same package:
- Library transitions.
- While most cases of library transitions can be handled with
targeted binNMUs these days, there remain a few cases where a
sourceful upload is needed, e.g. for renamed -dev
packages and necessary source changes to adapt to new APIs.
- Build-Problems caused by toolchain changes.
- Newer gcc versions are often more strict about what constructs
they allow. This can cause build problems in a lot of packages
once the default version used is increased. Other packages
that regularly cause new build problems are linux-libc-dev
and dpkg-dev. Since we only require that a package is
buildable in the release it is contained in, these are not
valid RC bugs for stable.
- Changes in the RC policy of the release team.
- One example during the lenny release cycle was the use of
invoke-rc.d in maintainer scripts to (re-)start daemons. Bugs
were filed against a lot of packages not using invoke-rc.d
before the etch release but they were only declared to be of
RC severity after etch.
- etc.
So as the first part of my triaging efforts I tried to identify
these cases. I also looked for bugs that were listed as affecting
etch due to incomplete version information (i.e. if there is no
version given, the bug is assumed to affect all versions).
As you can see from the bump in the graph I was able to identify
about 200 bugs that met one of these conditions.
In the future we should avoid letting the number of false positives
grow to such large numbers. Everyone can help with that:
- Maintainers
- If you get bugs reported without a version number and you
can verify that this bug was not present in an older version
of the package, add correct "found" information. If you get
a bug reported against a version that is both in stable and in
testing/unstable, but not valid for stable, tag the bug
appropriately. I would recommend erring on the side of false positives
though instead of on the side of false negatives!
- Mass-bug filers
- Most mass-bug filings that involve RC bugs should use tags to
avoid creating false positives.
- QA Group
- Bugs filed about the proposed orphaning or removal of packages
should usually be tagged, since only in very few cases do these
actually warrant any changes in stable.
At this point you probably ask yourself: What are the appropriate
tags? This is much less clear than one might hope, since the involved
tags (<suite>) changed/lost their meaning somewhat with the
introduction of version tracking. Following
my
question about the appropriate tags on debian-release, the release team
and Don Armstrong (for the debbugs maintainers) seem to have agreed on
their new meaning: A bug with suite tags affects the intersection of
the set of suites indicated by its version information and the set of
suites indicated by its suite tags.
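The agreed meaning can be written down as a plain set intersection. The following Python sketch is only an illustration of that rule and is not code from debbugs. The function name affected_suites is invented, and treating a bug without suite tags as "version information only" is an assumption of this sketch.

```python
def affected_suites(suites_from_versions, suite_tags):
    """Return the suites a bug affects under the rule described above.

    suites_from_versions: suites implied by the bug's version information.
    suite_tags: suites named by the bug's suite tags.
    Assumption: without any suite tags, version information alone decides.
    """
    by_version = set(suites_from_versions)
    by_tags = set(suite_tags)
    if not by_tags:
        return by_version
    return by_version & by_tags


# The package version is identical in stable (etch), testing (lenny) and
# unstable (sid), but the bug is only valid for testing/unstable.
print(sorted(affected_suites({"etch", "lenny", "sid"}, {"lenny", "sid"})))
# ['lenny', 'sid']
```

With this rule, tagging such a bug for testing and unstable is enough to take it off the list for stable even though version tracking alone cannot tell the suites apart.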
Next post: What about the other 600 RC bugs in stable?
Thu, 07 Aug 2008
I had to move a mailing list from SourceForge to a self-hosted Mailman
instance while preserving all the user options. Since one has no shell access
to SF's Mailman, I decided to extract the information from the Web-Interface,
which sadly enough is no easy task. There is no complete list available; it is only
available chunked by starting letter and in groups of 30 addresses or fewer.
Since the mailing list in question had about 1500 subscribers, manual transcription
was really no option. So I wrote a small script that automatically extracts all the
information and outputs it in a CSV-like format.
I've also hacked Mailman's add_members script to set all these options from this
format.
In case someone finds this useful, both are available on my git
server under free licenses:
- grab-subscribers.pl
- Extract subscriber information from Mailman Web-Interface with LWP.
- add_members
- Use the information extracted by grab-subscribers.pl to populate a Mailman mailing
list.
DISCLAIMER: This is hacked together really quickly and was used exactly once. Don't expect too much.
Fri, 13 Jun 2008
packages.ubuntu.com is now
hosted on a server provided by Canonical. This will hopefully greatly improve
performance and reliability, since my own server was increasingly swamped and
had repeated problems with its hard disks. Many thanks to Chris Jones for
handling the move.
Mon, 02 Jun 2008
I really hope the sponsoring situation still improves, but I've decided to go
either way. Judging from the last years, it's worth it :)
Tue, 13 May 2008
Just the other day I was wondering about what release of Debian a specific version
of a package was in, and I knew it was older than oldstable. But searching the Packages
files of archive.debian.org was tedious and
I thought that there has to be a better way. So I quickly set up a packages.debian.org
instance for archive.debian.org at archive.debian.net.
I guess this isn't really interesting for day-to-day use, but maybe someone can occasionally
profit from it.
Caveats:
- There are no binary files on archive.d.o for rex and buzz, so currently no information
is available about them.
- There are no Sources files available for rex, buzz and bo, only Packages, so no information
about the source packages can be presented.
- Together, the two points above of course mean that there is no usable information at all about
rex and buzz. If someone creates the missing indices and convinces the ftp-masters to
put them on archive.debian.org, I will gladly configure archive.debian.net to use them.
- Changelogs and copyright files are currently only available for pool-using releases, since
the extraction script has some assumptions that depend on that. On archive.debian.org,
the only release that uses
pool/ is woody.
Feel free to help me improve that site, preferably by sending patches against the archive-master
branch of git://git.debian.org/git/webwml/packages.git.
Fri, 22 Feb 2008
I will see you all at my talk, right? ;)
Wed, 06 Feb 2008
- packages.ubuntu.com is now fully migrated to the new server.
Since the old server was not shut down for four days after my contract expired, the actual downtime was very
small (yay for busy admins ;)).
- After completing this move I now plan to finally migrate this site to the newer code base as
used on packages.debian.org. My test site with the freshly merged
code can be found at packages2.ubuntu.lichtenheld.net
(no comments about ugly hostnames please...). Comments and patches (especially to the .css files) welcome.
The location of the git repository can be found on the About page.
- New gimmick in the general code: The site now checks the Accept-Language header twice, once against the
list of DDTP translations and once against the list of template translations. So, given a suitable Accept-Language
header it will not necessarily fall back to English in all cases anymore. For all the bi-, tri-lingual people out
there :). Not yet online on the main server, but coming soon.
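The "check twice" idea can be sketched in a few lines of Python. This is not the site's actual code (which is written in Perl). The function names, the example header and the two sets of available languages are all made up for illustration, and the header parsing is deliberately simplified.

```python
def parse_accept_language(header):
    """Return the language tags of an Accept-Language header, best first."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        tag, _, params = item.strip().partition(";")
        tag = tag.strip().lower()
        if not tag:
            continue
        quality = 1.0
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by position in the header.
            weighted.append((-quality, position, tag))
    return [tag for _, _, tag in sorted(weighted)]


def pick(preferred, available, fallback="en"):
    """Pick the first preferred language that is available."""
    for tag in preferred:
        if tag in available:
            return tag
        primary = tag.split("-")[0]  # e.g. "pt-br" -> "pt"
        if primary in available:
            return primary
    return fallback


preferred = parse_accept_language("fi, de;q=0.8, en;q=0.5")
# Hypothetical sets of available translations.
description_lang = pick(preferred, {"de", "fr", "en"})  # DDTP descriptions
template_lang = pick(preferred, {"fi", "de", "en"})     # page templates
print(description_lang, template_lang)  # de fi
```

Because the two lookups are independent, a visitor preferring Finnish gets Finnish templates with German descriptions instead of an all-English page.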
Fri, 01 Feb 2008
- I've moved packages.ubuntu.com to a new server but until one day ago forgot to
tell anyone about that :( The old server will probably go offline today and it might
be some time until the IP address gets corrected in DNS and until these changes arrive
at your DNS server. Sorry for any inconvenience caused. You can access the new server
under the alternate address packages.ubuntu.lichtenheld.net
for now.
- I've improved the search code for packages.debian.org a bit to give more useful results even
if the keyword is very generic. Feel free to test the new code at the usual
address before I put it online on packages.debian.org. I will wait some days to give the translators
a chance to catch up before doing that.
- If you are interested in packages.debian.org development and you will attend FOSDEM,
please note that I will give a talk about it there in the Debian DevRoom. I would be glad if someone
shows up ;)
Sat, 08 Sep 2007
I regularly get annoyed by the fact that I find random -dev packages on my main system and
can't figure out which package's build-depends I satisfied by installing them. Also they don't
disappear if the package changes its build-depends.
So I decided to solve this problem. Since
I'm not really a C/C++ hacker and didn't want to learn hacking APT for this, I chose to solve this
problem with brute force and Perl ;-)
The result is sourcedeps.debian.net, an APT archive
which contains one binary package for each source package. The following mapping was done:
source package name => binary package name, but with appended '-build-depends'
Build-Depends => Depends
Build-Depends-Indep => Recommends
Build-Conflicts => Conflicts
Binary => Suggests
Binary => Provides (with appended '-build-depends')
If any of the Build-Depends fields contains arch limiters, arch-dependent packages will be
created, otherwise one arch-independent package.
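The field mapping above can be expressed as a small function. The following Python sketch only illustrates the mapping; it is not the Perl code behind sourcedeps.debian.net. The function name meta_package and the sample source stanza are invented, and the sketch does not resolve arch limiters per architecture the way a real archive would have to.

```python
SUFFIX = "-build-depends"


def meta_package(source):
    """Map the fields of a source stanza to fields of a meta package.

    source is a dict with keys as found in a Sources file
    (Package, Binary, Build-Depends, ...). Empty fields are dropped.
    """
    binaries = [b.strip() for b in source.get("Binary", "").split(",")
                if b.strip()]
    build_fields = ("Build-Depends", "Build-Depends-Indep", "Build-Conflicts")
    # An arch limiter looks like "libfoo-dev [i386]".
    has_limiters = any("[" in source.get(f, "") for f in build_fields)
    fields = {
        "Package": source["Package"] + SUFFIX,
        # "any" stands in here for "one package per architecture".
        "Architecture": "any" if has_limiters else "all",
        "Depends": source.get("Build-Depends", ""),
        "Recommends": source.get("Build-Depends-Indep", ""),
        "Conflicts": source.get("Build-Conflicts", ""),
        "Suggests": ", ".join(binaries),
        "Provides": ", ".join(b + SUFFIX for b in binaries),
    }
    return {name: value for name, value in fields.items() if value}


# Invented source stanza, for illustration only.
example = {
    "Package": "foo",
    "Binary": "foo, libfoo-dev",
    "Build-Depends": "debhelper (>= 5), libbar-dev [i386]",
    "Build-Depends-Indep": "doxygen",
}
for name, value in meta_package(example).items():
    print(f"{name}: {value}")
```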
This allows easy installation of build-depends by installing the corresponding meta package,
tracking them, and removing them automatically in case you deinstall the meta package. It only
works for known architectures though and it requires creating around 60,000 binary packages.
It also doesn't allow tracking build-dependencies for more than one version of a package (e.g.
unstable and experimental).
You can use this by adding something like
deb http://sourcedeps.debian.net/ sid main contrib non-free
to your sources.list. The archive is signed with the following key, available from
a keyserver near you and signed by me:
pub 1024D/ED505694 2007-09-08 [expires: 2008-09-07]
Key fingerprint = 4ECF DF07 F419 0B5B 45C4 51D0 00E9 C47B ED50 5694
uid SourceDeps.Debian.Net Archive Key <archive@sourcedeps.debian.net>
Comments welcome.
Sat, 09 Jun 2007
To make the pages for single packages on packages.debian.org less loaded with
information I experimented today with changing the layout to a "tabbed" one, so that
information is spread over several sub pages. Currently this is implemented with
Javascript, but if people really like it I would probably implement it on the server side
too. Please check it out and tell me what you
think.
Sat, 02 Jun 2007
Copy of a mail I sent to debian-www earlier today. I don't think
it warrants posting to -devel-announce, but I post it here to make it
visible to people not usually following debian-www.
Since I seem to sense an increased stream of offers to help out with
packages.d.o coming my way
(but maybe that is just wishful thinking ;) I wanted to give a short
update on the development of the current code and the status of the
infrastructure so that nobody can claim he wanted
to help but failed due to lack of information.
- The code in the CVS is seriously lacking in many regards, especially
update and CGI speed. It runs pretty stable though and since we
excluded robots from using the CGI scripts it seems to run with at
least enough speed to keep any big complaints from rising.
(If anyone is interested, packages.d.o currently has
about 200,000 page hits a day, with a notable decrease
– up to 50,000 – during the etch freeze and a similar increase after
release)
- packages.d.o is currently run on puccini.d.o; this host is
exclusively used for this purpose. Administration is done
by group pkg_maint, current members are Martin 'Joey' Schulze
and me.
- The same code is also used to create the page
packages.ubuntu.com, see
branch ubuntu in CVS.
packages.ubuntu.com is run on a private server of mine and
only administered by me.
- Everybody with write access to the website has also write access
to this part of the CVS. The code run on packages.d.o is
updated once a day during the update cron job. Since there
is no staging ground for changes, any commits should be made
with extreme care...
- Last year Jeroen van Wolffelaar and I started to develop a new
version with the goal to make dynamic page generation possible
which allows for faster update of information and more flexible
presentation. We coordinated our work by using a SVN repository
located at svn.wolffelaar.nl.
- This development was stalled several times when he and/or I had no time
to actively pursue it.
- In April 2007 I decided to revive the development to get the code
in a state ready for deployment. Since I was by then initiated
into the wonders of distributed scm, I decided to move the code to
yet another repository, namely
git
(for cloning use git://source.djpig.de/git/packages.git)
There is also a ubuntu branch there but it is in a rather
sorry state atm regarding site layout.
- I think the new code is currently in a state where it could be
safely deployed to packages.d.o but I'm currently waiting for
an etch upgrade of the host since the code has grown some dependencies on
stuff only available in etch. The next possibility for this
to happen is probably during Debconf.
- I don't know yet how to handle the SCM stuff when the code
is to be deployed.
- You can try out this new code base at
packages.debian.net
(Might be slightly out-of-date sometimes since I don't run the
cronjob as often as possible to give packages.ubuntu.com more
resources)
- Bug reports and patches against this version are very much
welcome,
please send them directly to me (but feel free to CC debian-www
if you want to have an open discussion on the matter in question).
- Bug reports and patches against the version currently in use should
be directed to the BTS as usual.
- If there is interest we can make a little improvised BoF about
packages.d.o at Debconf.
Thu, 23 Nov 2006
For about five months now the
Debian package of
pbbuttonsd builds a
package on i386, too, since it might actually be useful to have
these available on MacBook (Pro) machines as well. What I don't know
(since up until now I never actually used or even saw a MacBook with
Linux installed) is whether any significant part of the functionality
is working there or if it all depends on hardware that was specific
to the PowerBooks and iBooks (e.g. PMU).
I've decided that I will remove the i386 binary again before etch
if nobody claims that it has proven useful to him. So if there
are any users out there that would actually miss that package, please
speak up now...
Fri, 03 Nov 2006
... or "why breaking your SONAME is a bad idea".
If you ever had any questions on why it is a bad idea to break your
SONAME (i.e. not changing it despite ABI and/or API changes in your
library), please take a look at the current
Debian bug list of gtkpod.
Oh, and it also shows why it is an especially bad idea to prepare a
package for a library broken in that regard without at least changing
the package name...
So, to all the Debian gtkpod users out there: Please don't use libgpod
from debian-multimedia.org
together with gtkpod 0.99.4 from Debian. I will try to make sane
versions of libgpod 0.4.0 and gtkpod 0.99.8 available as soon as
possible. If you want to help with this (the
RFH for gtkpod is still open),
please don't hesitate and drop me a mail.