Friday, July 25, 2008

I've discovered that my seat-of-the-pants install of ATLAS a week or so ago caused a from-scratch build of NumPy to be broken. :/ So I'm now burning up some time doing a correct install of ATLAS/LAPACK, and maybe I'll actually be able to "import numpy" by the end of the day. (Yes, my machine is that slow).

The monkeypatching madness is now gone from numpy.testing, as are the decorators. There have been some minor additions/changes to support SciPy tests (like moving print_assert_equal to NumPy so there don't have to be 3 separate but identical implementations in SciPy) and make the output more informative (show NumPy/SciPy/Python versions and say which tests are being run).
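
For reference, usage of print_assert_equal looks like this (my own toy example, not code from the patch); it stays silent when the values match and raises AssertionError with both values printed when they don't:

from numpy.testing import print_assert_equal

print_assert_equal('sum check', 1 + 2 + 3, 6)   # silent: the values match
# print_assert_equal('sum check', 1 + 2 + 3, 7) would raise AssertionError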

I submitted patches for the __mul__ and __mod__ problems of chararray rather than committing fixes directly, since the fixes would actually change the current behavior of the class, and I figure that's best looked at by others. This presumes that somebody's actually using the class.... ;) The 1.1.x development should be done soon, and then there should be plenty of eyes on the 1.2 defects, so the fixes should still fit into the GSoC schedule.

Jarrod also asked if I could standardize the NumPy imports (as "import numpy as np") whenever I run across them, so I've been doing that a little this week in SciPy and NumPy. As soon as I can get my ATLAS/LAPACK issue sorted out, and I can be sure I haven't broken anything, I'll be checking some more of those changes in.

I'm also cleaning up the SciPy test output. Currently it's littered with deprecation warnings and (apparently) debugging output from the libraries.

Finally, since the deadline is in sight, I figured I should spend some time on the final documentation. I've got some scripts hacked together to automatically generate HTML tables (for overly detailed reports) and diff files from svn + numpy/scipy-svn mailing lists, which should take care of the bulk of the tedious work (which, for some reason, I did manually last year...).

Thursday, July 17, 2008

SciPy has been switched over to use numpy.testing, scipy.testing has been removed, and it seems like everything is ok with the unit tests. Unlike NumPy, SciPy doesn't import all its subpackages on "import scipy", so the doctests need some slightly different execution context rules. Right now it looks like each doctest will have the module in which it is declared made available to it. So, for example, a doctest for scipy.linalg.qr will have linalg available in its context (an implicit "from scipy import linalg"). Of course this means there's a ton of SciPy doctests that need to be updated so that they'll run again.
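
As a made-up illustration (not an actual SciPy docstring), a doctest living somewhere under scipy.linalg could then do this without spelling out the import:

>>> import numpy as np
>>> a = np.eye(3)
>>> q, r = linalg.qr(a)            # linalg supplied by the doctest context
>>> np.allclose(np.dot(q, r), a)
True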

The monkeypatching of the nose doctest plugin in numpy.testing.nosetester, which was originally a method or two, has gotten a little out of hand (as pointed out by others on the numpy-discussion list), so I'm in the process of changing over to a new plugin subclassed from the nose doctest plugin (nose.plugins.doctests.Doctest). I'm also going to replace the decorators used to write the NoseTester method docstrings; since there are no other decorators used in NumPy, I'm not going to be the one who adds them. ;) Hopefully all that should be done tomorrow.

The __mul__ issue for defchararray was interesting; it appears that if you pass a NumPy integer type to ndarray.__new__ as the itemsize argument, it uses the itemsize of the integer object rather than the value held by it, so (on my 32-bit machine) this meant that the returned array always used an itemsize of 4 bytes. I'm not sure if this behavior is intentional.
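
Here's my own quick illustration of the symptom (not a test of the exact internal code path); on an affected build, the behavior described above would give an itemsize of 4 here instead of 10:

import numpy as np

# Passing a NumPy integer scalar as itemsize: the bug described above would
# use the scalar's own itemsize (4 bytes for int32) rather than its value.
a = np.chararray((3,), itemsize=np.int32(10))
print(a.itemsize)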

Tuesday, July 8, 2008

Alright, so I didn't spend the entire week looking into the NumPy test environment pollution that occurs when doctests are run. I wanted to write some new tests to improve coverage, so I had to finish updating all the docstrings to run under the sanitized namespace, and get a more useful indication of coverage. That was a little tedious, but not difficult. I had to write a short script to generate annotated NumPy source (I didn't see any easy way to make nose do it), since the default coverage output--a list of uncovered line numbers--is impossible to use. For me, anyway.
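
A rough sketch of the idea (this isn't the actual script, just the general shape of it):

def annotate(source_path, uncovered):
    # Mark each uncovered line so that big untested blocks stand out when
    # scanning the file by eye.
    with open(source_path) as f:
        for lineno, line in enumerate(f, 1):
            marker = '!' if lineno in uncovered else ' '
            print('%s %4d  %s' % (marker, lineno, line.rstrip('\n')))

# e.g.: annotate('numpy/core/defchararray.py', set([12, 13, 47]))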

Since nobody had suggested any specific bits of NumPy that they thought deserved more coverage, I just started slogging through the annotated source and looked for big blocks of uncovered code. There weren't any tests for chararray, so I started there.

The __mul__ operator doesn't seem to act right; string results that should be longer than 4 characters seem to get truncated (the resulting array dtype is S4 for some reason). The __mod__ operator doesn't seem to act like I'd expect either, but at the moment I'm willing to chalk it up to not knowing enough about how NumPy does things. So I'm off to go read more...
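
For the record, this is the kind of thing I mean (my own illustration; the output obviously depends on whether you have an affected version):

import numpy as np

a = np.char.array(['ab', 'cd'])
b = a * 3
# One would expect ['ababab', 'cdcdcd'] with dtype S6; the behavior described
# above gives a truncated S4 result instead.
print(b)
print(b.dtype)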

Tuesday, July 1, 2008

Test environment pollution

For some wonderful reason, running doctests along with the unit tests pollutes the test environment and causes tests to fail (these same tests pass when doctests are not run). So figuring out this tangled mess of execution context looks to be a significant part of what I'll be doing the rest of the week. Random thoughts and information related to these issues follow. :)

I found that I could specify a "sanitized" globals dictionary for the doctests (in the loadTestsFromModule method that's been monkeypatched onto nose.plugins.doctests.Doctest), which, as far as I can tell, has the desired effect of simulating running the doctests in a fresh Python session in which "import numpy as np" has been executed. This helps prevent doctests from depending on things that might have been imported into the module where they live, locally defined functions, etc.
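
The core of the idea is just handing doctest a fresh globals dict instead of the module's own namespace; a stripped-down sketch (not the actual nosetester code):

import doctest
import numpy as np

def sanitized_suite(module):
    # Collect a module's doctests, but run them as if they were typed into a
    # fresh interpreter where only "import numpy as np" has been executed.
    return doctest.DocTestSuite(module, globs={'np': np})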

Some of the tests make changes that affect other tests, like the doctests in lib/scimath.py that use numpy.set_printoptions to change the display precision in order to make test output predictable. I can monkeypatch an afterContext method onto nose.plugins.doctests.Doctest to restore the original state, but this seems like a fragile way to do things, since I have to code restoration of everything that might get changed. (At least it works, though.)
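
For the print options case, the save/restore step itself is simple (a sketch of the idea, not the actual patch):

import numpy as np

# Snapshot the print options before the doctests run...
saved_options = np.get_printoptions()

# ...and restore them afterwards (roughly what an afterContext hook would do),
# so a doctest's set_printoptions call can't leak into later tests.
def restore_printoptions():
    np.set_printoptions(**saved_options)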

Somehow, when doctests are run, the memmap module replaces the memmap class in the execution context of the test_memmap.py tests, and so they fail. I really don't know how that's happening yet.

Monday, June 30, 2008

NumPy doctest customizations committed

Since nobody objected to my warning post, the following features are now available for all NumPy 1.2 doctests:
  1. All doctests in NumPy will have the numpy module available in their execution context as "np".
  2. The normalized whitespace option is enabled for all doctests.
  3. Adding "#random" to expected output will now cause that output to be ignored, but the associated example command will still be executed. So you can now do this to provide a descriptive example but not have it fail:
    >>> random.random()
    0.1234567890 #random: output may differ on your system
Later today I'll commit some changes to existing doctests that take advantage of these features. Apparently some of the doctests haven't been run in a long time; tensorsolve, for example, referenced an undefined "wrap" function, and there were no unit tests to turn up this problem. :/ We're going to add a flag to the buildbot test command to run the doctests as well from now on, so that might help improve coverage a bit.

By the way, many thanks to whoever installed nose on the NumPy buildbots! :)

Thursday, June 26, 2008

Oops, nevermind about <BLANKLINE>

One should not post first thing in the morning, it seems. Avoiding use of <BLANKLINE> in doctests has nothing to do with whitespace normalization. :)

Doctest tweaking

Based on a long discussion thread, I'm in the midst of hacking up the NumPy test framework to make the following changes:
  1. All doctests in NumPy will have numpy in their execution context as "np" (i.e., there will be an implicit "import numpy as np" statement for all doctests, which makes the sample code shorter).
  2. The "normalize whitespace" option is enabled for all doctests. This one didn't come up in the discussion, but I ran across a few existing examples that fail because of insignificant whitespace differences (like a space after a calculation result), and figured it would save a lot of pain on the part of docstring editors. Especially since the default failure message doesn't show that whitespace is causing the test to fail. (I think this should also avoid having to use the ugly <BLANKLINE> in docstrings; I'll have to check.)
  3. Commands that have certain trigger strings in them will have their output ignored (but they will still be executed). At present the list of triggers is: ['plt.', 'plot.', '#random']. This avoids failures caused by output differences that aren't relevant: plotting function/method calls that return objects which (for some reason) include an address in their repr, examples that generate inherently random results, etc. (There's a small sketch of the check just after this list.)
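
The check itself is nothing more than a substring scan, along these lines (illustrative names, not the committed code):

OUTPUT_IGNORE_TRIGGERS = ['plt.', 'plot.', '#random']

def want_output_checked(example_source):
    # The example still gets executed; we just skip comparing its output if
    # the source mentions any of the trigger strings.
    return not any(t in example_source for t in OUTPUT_IGNORE_TRIGGERS)
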
I may also turn on the detailed diff output for doctests; in some of the more complicated outputs, it's not always easy to spot some tiny difference. Come to think of it, the unittest fixture failure output could benefit from this, too, for the same reason.

All these changes will get run by the numpy-discussion list before they're actually committed, of course.

Monday, June 23, 2008

Old test stuff restored

All the old (1.1) test classes are back in NumPy. I can run old-style tests without any troubles locally, and hopefully that will hold for all the user test suites out there.

The module .test() functions will now also take the old arguments (which fortunately were passed in as keyword arguments in all the cases I've seen so far), and warn that they should be updated before the next release. As before, test() returns a TextTestResult; this required a little tweaking of nose, because the test result object is discarded by default. Apparently nobody ever needed it back before, and that's why it's not kept. This might be changed in nose at some point in the future.
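
The backwards-compatible argument handling amounts to something like this (a sketch with illustrative argument names, not the actual method):

import warnings

def test(label='fast', verbose=1, **old_style_kwargs):
    # Tolerate the old-style keyword arguments for now, but nag about them.
    if old_style_kwargs:
        warnings.warn('These test() arguments are deprecated and should be '
                      'updated before the next release: %s'
                      % ', '.join(sorted(old_style_kwargs)),
                      DeprecationWarning, stacklevel=2)
    # (The real method goes on to run the nose-based suite and returns its
    # TextTestResult.)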

Unfortunately there's no way to tell if the new test suite works on the various buildbot machines, since none of them apparently have nose installed. :(

I managed to inadvertently add coverage support when I made my first big checkin. Playing around with experimental stuff in your working directory at the last minute FTL. :/ Anyway, Robert Kern was nice enough to point out some areas in which it was lacking, and those are now fixed. Running numpy.test(coverage=True) will now report coverage limited to NumPy, numpy.core.test(coverage=True) will limit coverage reporting to numpy.core, etc.

A quick test shows that it's not hard to change SciPy to use numpy.testing, allowing the removal of the code in scipy/testing.

Tuesday, June 17, 2008

NumPy has switched to nose

Ok, so I checked in my nose changes...and all the buildbots are red, and people's code is broken. :( So I have to do what I should have done in the first place and make sure tests written with the old test framework will still function in 1.2.

At least the switch turned up some useful information: since nose picks up tests that weren't being run by the original test suite, it uncovered some old code and tests in numpy/f2py/lib/parser which can probably be deleted. Well, maybe; still waiting on a definite answer on that. Deleting the two files (Fortran2003.py and test_Fortran2003.py) doesn't seem to cause any problems, and gets rid of some verbose test output.

Saturday, June 14, 2008

NumPy nose migration

Ok, now that finals are over, the only disruptions in my day should be realtors showing potential buyers around, and the occasional packing spree.

I'm currently reviewing the changes to switch NumPy to nose before I check them in. Right now my local numpy.test() picks up and runs 1656 tests, with no failures or errors. Since I'm touching a lot of files, I figure a final scan over the diffs can't hurt. I already found a couple of tests that weren't being run under the old framework because they weren't defined correctly.

I had to add --exclude arguments to ignore the directories below numpy/numpy/distutils/tests (f2py_ext, etc.), since those little test extension modules don't get built by default. These tests weren't run by the old test suite anyway, but maybe it would be nice to see if they can be included later on.
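
In nose terms that's just extra command-line-style arguments, something like this (illustrative; numpy.testing builds its own argument list, and there are more excluded names than shown here):

import nose

# Skip anything matching the named directory when collecting tests.
nose.run(argv=['nosetests', '--exclude', 'f2py_ext', 'numpy.distutils.tests'])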

I really like that nose has coverage support (many thanks to whoever added that). I wonder how hard it would be to implement some sort of coverage diff to let people check how a patch or local modifications to their NumPy tree will affect coverage.

Sunday, June 8, 2008

Avoiding __test__ usage

I ran across some existing tests in NumPy today that made me realize there's no particular reason that classes like LinalgTestCase need to be derived from unittest.TestCase. Not sure why that didn't occur to me earlier. Anyway, if the classes with type-specific test methods don't derive from TestCase, nose won't try to run them. I think that this:
from unittest import TestCase
from numpy import array, dot, identity, single, linalg
from numpy.testing import assert_almost_equal

class LinalgTestCase:
    def test_single(self):
        a = array([[1.,2.], [3.,4.]], dtype=single)
        b = array([2., 1.], dtype=single)
        self.do(a, b)

class test_inv(LinalgTestCase, TestCase):
    def do(self, a, b):
        a_inv = linalg.inv(a)
        assert_almost_equal(dot(a, a_inv), identity(a.shape[0]))

is much cleaner-looking than the __test__ solution from June 2nd.

Friday, June 6, 2008

Just so it's somewhere...

Since I don't know where else to put this, it's going here for now. There was a need for a list of all the functions in NumPy that took an "out" parameter:

http://alanmcintyre.webfactional.com/numpy-out-func-like.txt
http://alanmcintyre.webfactional.com/scipy-out-func-like.txt

The horrible script I used to generate these lists is here (just replace "numpy" with "scipy" in the call to handle_module to generate the SciPy list):

http://alanmcintyre.webfactional.com/out-finder.py
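
The gist of it is nothing fancy; a trimmed-down sketch of the same idea (not the posted script itself):

import numpy as np

def find_out_like(module):
    # Crude heuristic: report callables whose docstring mentions an "out"
    # argument. C-implemented functions don't expose argument lists to
    # introspection, so scanning the docstring is about all there is to do.
    hits = []
    for name in dir(module):
        obj = getattr(module, name)
        doc = getattr(obj, '__doc__', None) or ''
        if callable(obj) and ('out=None' in doc or 'out :' in doc):
            hits.append(name)
    return sorted(hits)

# e.g.: print('\n'.join(find_out_like(np)))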

Thursday, June 5, 2008

Well that's nice...

Turns out I was mistaken about the requirements for an independent study I took this quarter. I thought I was on the hook to write a term paper, but the weekly work I had been assigned was intended to cover the requirements. So that makes for a less stressful end of the quarter (and more time for GSoC!).

Hrm...had a brownout this morning and now the Linux machine I had set aside for GSoC seems to be missing some things...like the ability to use its network card. :/ Hopefully it's a software problem that won't take long to fix.

Update: Bah, our router apparently doesn't like doing the DHCP job any more, so I just manually set the IP address. At least everything seems to work ok with that.

Monday, June 2, 2008

Test inelegance

There are some NumPy tests that use a class derived from unittest.TestCase to contain a set of tests that need to be run on multiple types, and then inherit from that class to implement the generic test, like this:

from numpy import array, dot, identity, single, double, linalg
from numpy.testing import NumpyTestCase, assert_almost_equal

class LinalgTestCase(NumpyTestCase):
    def test_single(self):
        a = array([[1.,2.], [3.,4.]], dtype=single)
        b = array([2., 1.], dtype=single)
        self.do(a, b)

    def test_double(self):
        a = array([[1.,2.], [3.,4.]], dtype=double)
        b = array([2., 1.], dtype=double)
        self.do(a, b)


class test_inv(LinalgTestCase):
    def do(self, a, b):
        a_inv = linalg.inv(a)
        assert_almost_equal(dot(a, a_inv), identity(a.shape[0]))


It seems that nose wants to run the LinalgTestCase class directly, since it's derived from TestCase, which of course raises silly errors. So for the time being I'm using the __test__ attribute to force nose to run the correct class:

class LinalgTestCase(NumpyTestCase):
    __test__ = False
    def test_single(self):
        a = array([[1.,2.], [3.,4.]], dtype=single)
        b = array([2., 1.], dtype=single)
        self.do(a, b)

    def test_double(self):
        a = array([[1.,2.], [3.,4.]], dtype=double)
        b = array([2., 1.], dtype=double)
        self.do(a, b)


class test_inv(LinalgTestCase):
    __test__ = True
    def do(self, a, b):
        a_inv = linalg.inv(a)
        assert_almost_equal(dot(a, a_inv), identity(a.shape[0]))

This seems like an ugly way to make the tests run properly, but it works for the moment. If I can find a prettier method later I'll switch to it.

Friday, May 30, 2008

Shameless copying in progress

I'm currently swiping the nose test setup from SciPy and wedging it into NumPy. So far I've copied over the nose-related modules from scipy/testing, tweaked them a little, renamed a lot of NumPy tests from "check_something" to "test_something" (nose is currently using test match regex "^test_*", which, coupled with the method renaming, seems to keep it from trying to run things it shouldn't), and changed the test class fixtures to inherit from unittest.TestCase instead of NumpyTestCase.

At the moment the nose-ified NumPy test suite picks up 1749 tests; 48 of these have errors and 12 fail. Some of those look like actual test failures, but most of them are clearly a result of the tests expecting special behavior that used to be provided by NumpyTestCase.

A full test run on a clean build of NumPy from the svn trunk finds 1275 tests with no errors and no failures, so I apparently have some cleanup/checking to do yet before numpy.test() should work correctly. I already stumbled across a few test modules that imported other test modules so that they got run correctly under the old framework, so there's probably some of that going on somewhere, giving a test count greater than the total number of unique tests.

Thursday, May 29, 2008

Just so it's not empty...

So this is my shiny new blog that will be used for my Google Summer of Code 2008 project and other NumPy-related stuff. And now it's not devoid of posts. :)

Up to this point I've just been reviewing NumPy and SciPy Trac tickets, and commenting on or fixing those that I could learn enough about to be helpful.