Skip to content

Conversation

@fingolfin
Copy link
Member

This is WIP, more details will be added when I actually managed to do something useful

@fingolfin fingolfin added do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements topic: tests issues or PRs related to tests labels Jul 17, 2017
@codecov
Copy link

codecov bot commented Jul 17, 2017

Codecov Report

Merging #1512 into master will decrease coverage by 12.77%.
The diff coverage is 0%.

@@             Coverage Diff             @@
##           master    #1512       +/-   ##
===========================================
- Coverage   63.68%    50.9%   -12.78%     
===========================================
  Files         946      438      -508     
  Lines      284375   232012    -52363     
  Branches    12725    10446     -2279     
===========================================
- Hits       181094   118108    -62986     
- Misses     100488   111119    +10631     
+ Partials     2793     2785        -8
Impacted Files Coverage Δ
lib/test.gi 44.76% <0%> (-2.74%) ⬇️
lib/attr.gi 0% <0%> (-100%) ⬇️
lib/teachm2.g 0% <0%> (-100%) ⬇️
lib/contfrac.gi 0% <0%> (-89.48%) ⬇️
lib/wpobj.gi 8.69% <0%> (-89.14%) ⬇️
grp/imf.gi 0% <0%> (-84.96%) ⬇️
lib/proto.gi 0% <0%> (-83.34%) ⬇️
lib/ctblauto.gi 0% <0%> (-82.41%) ⬇️
lib/lierep.gi 0.04% <0%> (-82.26%) ⬇️
lib/ctbllatt.gi 0% <0%> (-80.72%) ⬇️
... and 824 more

@fingolfin fingolfin force-pushed the mh/travis-testpackages branch 11 times, most recently from 3fa2041 to 69762ae Compare July 20, 2017 15:20
@fingolfin fingolfin force-pushed the mh/travis-testpackages branch from 10a647a to a23f90a Compare July 20, 2017 17:47
@fingolfin fingolfin force-pushed the mh/travis-testpackages branch 3 times, most recently from d572e97 to 534b3a9 Compare August 1, 2017 11:28
@fingolfin
Copy link
Member Author

Yay, this finally fails in the right way ;-). Namely, while running package tests. I get test failures for these packages on Travis:

  • anupq: not sure what's wrong there; perhaps caused by missing -x 80, which I'll add now (but doesn't look like it to me)
  • Browse: seems to have problems caused by lack of internet access
  • cubefree: just hangs too long and causes Travis to kill the build...

@alex-konovalov does cubefree run fine on Jenkins?

@olexandr-konovalov
Copy link
Member

@fingolfin cubefree runs fine in stable release in Docker container: https://travis-ci.org/gap-system/gap-docker-pkg-tests/jobs/219241646. On Jenkins in master branch it also runs fine, for example:

============================OUTPUT START==============================
#I  RunPackageTests("cubefree", "1.16", "tst/autoTest.tst", auto);
 ���������������������������   GAP 4.dev of today
 ���  GAP  ���   https://www.gap-system.org
 ���������������������������   Architecture: x86_64-pc-linux-gnu-gcc-default64
 Libs used:  gmp 6.0.0, readline
 Loaded workspace: wsp.g
 Components: trans 1.0, prim 3.0, small* 1.0, id* 1.0
 Packages:   AClib 1.2, Alnuth 3.0.0, AtlasRep 1.5.1, AutPGrp 1.8, 
             Carat 2.1.6, CRISP 1.4.4, Cryst 4.1.12, CrystCat 1.1.6, 
             CTblLib 1.2.2, Cubefree 1.16, FactInt 1.5.4, FGA 1.3.1, 
             GAPDoc 1.5.1, GrpConst 2.5, IO 4.4.6, IRREDSOL 1.3.1, 
             LAGUNA 3.7.0, Polenta 1.3.7, Polycyclic 2.11, RadiRoot 2.7, 
             ResClasses 4.6.0, Sophus 1.23, SpinSym 1.5, TomLib 1.2.6, 
             Utils 0.46
 Try '??help' for help. See also '?copyright', '?cite' and '?authors'

# line 1 of 6 (17%)
# line 2 of 6 (33%)
# line 3 of 6 (50%)
# line 6 of 6 (100%)
                                    
Compare with SmallGroups Library
msecs: 141223
#I  No errors detected while testing package cubefree 1.16
#I  using the test file `tst/autoTest.tst'
#I  RunPackageTests("cubefree", "1.16", "tst/autoTest.tst", auto): runtime 
141378
============================OUTPUT END================================

@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented Aug 1, 2017

@fingolfin see this newly created wiki page for the status of tests in the master branch: https://github.com/gap-system/gap/wiki/Package-tests-in-the-release-candidate-for-the-next-major-release. Also, if you need logs, please ask me. There will be an update of this table tonight or tomorrow.

@olexandr-konovalov
Copy link
Member

@fingolfin just in case, are you aware of the documented function TestPackage - could it be of any help? In GAP 4.8, also callable via make testpackage PKGNAME=pkgname (make target may need a fix to work in GAP 4.9 - haven't checked).

@fingolfin
Copy link
Member Author

I just run the cubefree tests on my machine -- and they simply are super slow, so the time out happens because of that. I guess I'll just have to skip them for now.

@alex-konovalov Ah, no, I wasn't aware of TestPackage, thanks for the hint. Unfortunately it is useless to me right now, because it has no return value, so I have no sane way to determine its result. But it shouldn't be hard to fix that.

@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented Aug 1, 2017

shouldn't be hard to fix that

indeed, except for packages that still do not conform to the latest advances in test organisation.

@fingolfin
Copy link
Member Author

except for packages that still do not conform to the latest advances in test organisation

You are absolutely right. On the short term, we may have to resort to insane output text parsing for those after all (and even then, my understanding is that some don't produce helpful results, right?).

On the long run, I want to identify all such packages, and get their maintainers to improve that. On the really long run, we might even threaten to remove non-compliant packages... But that's in the far future, one step after the other.

@olexandr-konovalov
Copy link
Member

There are different grades of non-compliance here. In no particular order:

  1. Having no tests at all (can we then extract examples from GAPDoc based manuals and run them?)
  2. Having some test files, but not specifying a default test in PackageInfo.g
  3. Having a .tst file specified in PackageInfo.g, but running with diffs which hardly could be interpreted by a non-author of the package
  4. Having a .g script which still calls ReadTest instead of Test.
  5. Having a test file which just produces output instead of reporting only diffs
    etc.

@fingolfin
Copy link
Member Author

@alex-konovalov Here is my view on that:

  1. no tests: bad, but not a bug. Though for new packages (deposited or submitted), I recommend that we demand some minimal tests.
  2. tests, but not hooked up in PackageInfo.g: see 1.
  3. test always produce diffs: that's a bug, that must be resolved (could be a bug in the package, or a regression caused by GAP, or ...). There is no point in second guessing these things. The package maintainers should take care of them.
  4. Using ReadTest: not a bug, though we should urge maintainers to switch to Test and TestDirectory, and demand it for new submissions.
  5. No scriptable way to detect test failure: see 1
  6. Test is a .g file but does not use exit code to signal test status: see 5. resp. 1.

@fingolfin
Copy link
Member Author

fingolfin commented Aug 2, 2017

So most package tests were run now by Travis. Note to self: We should perhaps have variants of this job which use -A, and/or use LoadAllPackages. Some notes on package tests:

  • alnuth: uses ReadTest; should use TestDirectory
  • anupq: lots of diffs; tests run over 7 minutes (!)
  • atlasrep: lots of diffs; tests run over 3 minutes
  • browse: diffs; tests run over 2 minutes
  • cubefree: test takes too long
  • circle: should use TestDirectory
  • congruence: should use TestDirectory
  • crime: test file is named test.g, but is actually a .tst file
  • crisp: tests are fine, but run for over 2 minutes
  • ctblib: tests fail (lots of diffs); runs over 6 minutes
  • cvec: tests fails (regression caused by change in master branch, fixed in 2.5.7)
  • digraphs: fails to load, "missing symbol atexit", huh?
  • fga: uses ReadTest; should use TestDirectory
  • fining: fails, needs cvec 2.5.7 to fix it
  • float: fails to load, "missing symbol atexit", huh?
  • forms: uses ReadTest; should use TestDirectory
  • fplsa: uses ReadTest; should use TestDirectory
  • fr: uses ReadTest; should use TestDirectory
  • gbnp: test fails due to package having been loaded before test runs
  • gpd: should use TestDirectory
  • guarana: diffs; uses ReadTest; should use TestDirectory
  • hap: diffs
  • hapcryst: diffs
  • happrime: uses ReadTest; should use TestDirectory
  • irredsol: tests pass; but take 3+ minutes; doesn't use TestDirectory (but development version does)
  • itc: cannot be loaded
  • laguna: lots of diffs; tests run 5+ minutes
  • loops: tests seem to pass; uses ReadTest; should use TestDirectory
  • lpres: tests seem to pass; should use TestDirectory
  • modisom: lots of diffs
  • openmath: should use TestDirectory
  • patternclass: uses ReadTest; should use TestDirectory
  • polenta: should use TestDirectory
  • polycyclic: lots of diffs; uses ReadTest; should use TestDirectory
  • polymaking: lots of diffs; uses ReadTest; should use TestDirectory
  • radiroot: uses ReadTest; should use TestDirectory
  • rcwa: diffs
  • resclasses: diffs
  • semigroups: diffs
  • simcomps: diffs
  • unitlib: error: "IntHexString: invalid character in hex-string"
  • utils: diffs
  • wedderga: diffs
  • xmod: diffs

At this point, I don't get farther, because the test tends to exceed the maximum time Travis allows for tests (seems to b 50 minutes). So we may need to

  • speed up tests,
  • split this into multiple jobs, say for letters A-F, then G-?? etc.

@olexandr-konovalov
Copy link
Member

Updated autogenerated table at https://github.com/gap-system/gap/wiki/Package-tests-in-the-release-candidate-for-the-next-major-release - now with the results of the tests with all packages loaded as well.

@fingolfin fingolfin force-pushed the mh/travis-testpackages branch from c908e70 to fafcd06 Compare August 21, 2017 16:28
@fingolfin fingolfin force-pushed the mh/travis-testpackages branch 2 times, most recently from cd5ee26 to 7526d40 Compare August 21, 2017 16:33
@fingolfin
Copy link
Member Author

I have now put parts of this into PR #1610. The problem with package tests is that they still take too long to run. We could split this test into multiple, but it seems wasteful to re-compile GAP again and again.

But Travis has a new (beta) feature called build stages, which perhaps we could use to solve this: With build stages, we can group and run jobs into multiple "stages": Jobs in the same stage are run in parallel, but stages are run sequentially.

We could thus put all the testinstall etc. jobs into one stage. Then, once those completed, we run a second stage which builds all packages (alternatively, this stage could be part of the first stage). Then, once that stage has completed, we run a stage which contains jobs that run package jobs (grouped in some way). The winning part: We can cache the results of jobs across stages, i.e. the "build all packages" job could cache the output, and then the various "test packages" jobs could reuse that, avoiding the overhead for compiling all packages in each "test packages" job.

Alternatively, we could also use that Travis allows us to build docker images somehow, but I am not (yet) familiar with that, and so don't know if it would actually work for us.

@fingolfin fingolfin force-pushed the mh/travis-testpackages branch from 7526d40 to 462e125 Compare September 14, 2017 19:15
@fingolfin fingolfin force-pushed the mh/travis-testpackages branch from 462e125 to ad2dbb3 Compare November 14, 2017 17:36
@olexandr-konovalov
Copy link
Member

@fingolfin how much of this PR we would like to keep? We now have tests of packages that use regularly built docker containers (see Travis badges at https://github.com/gap-system/gap-distribution), where there is a separate Travis test for each package, and for each package that have a standard test (i.e. the one specified in PackageInfo.g) its standard tests will be run three times - without, with all, and with default packages loaded.

@fingolfin
Copy link
Member Author

I still find tests which are run directly on each PR much more useful than running tests separately (which means that regressions will only be noticed after a PR got merged, i.e. "when it is too late).

However, with Travis' (understandable) limitations, this is not realistic at this time, so I am fine with closing this.

@fingolfin fingolfin closed this Jan 23, 2018
@olexandr-konovalov
Copy link
Member

Totally agree that tests before are better than tests after, but certainly with Travis limitations we have to prioritise.

At least with those tests after, we run them next day; they are public so we can point others to them; and there are individual jobs per package, so convenient to see their status per package.

@fingolfin fingolfin deleted the mh/travis-testpackages branch January 23, 2018 20:15
@olexandr-konovalov
Copy link
Member

olexandr-konovalov commented Jan 25, 2018

@fingolfin perhaps a sensible extension would be to check after building packages that packages are loadable. A variation of make testpackagesload would do the job. That does not take as much time as running actual package tests specified in their PackageInfo.g files, and exercises each package. We still can tweak the test to omit packages that we do not expect to load. I will try to make a PR.

Update: see now #2119

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements topic: tests issues or PRs related to tests

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants