doc: ensure sphinx output is reproducible

Message ID 20230629125838.1995751-1-christian.ehrhardt@canonical.com (mailing list archive)
State New
Delegated to: Thomas Monjalon
Series doc: ensure sphinx output is reproducible

Checks

Context                          Check    Description
ci/checkpatch                    success  coding style OK
ci/loongarch-compilation         success  Compilation OK
ci/loongarch-unit-testing        success  Unit Testing PASS
ci/intel-Testing                 success  Testing PASS
ci/iol-mellanox-Performance      success  Performance Testing PASS
ci/Intel-compilation             success  Compilation OK
ci/iol-x86_64-compile-testing    success  Testing PASS
ci/intel-Functional              success  Functional PASS
ci/iol-testing                   success  Testing PASS
ci/github-robot: build           success  github build: passed
ci/iol-aarch-unit-testing        success  Testing PASS
ci/iol-abi-testing               success  Testing PASS
ci/iol-x86_64-unit-testing       success  Testing PASS
ci/iol-unit-testing              success  Testing PASS
ci/iol-broadcom-Performance      success  Performance Testing PASS
ci/iol-broadcom-Functional       success  Functional Testing PASS
ci/iol-intel-Functional          success  Functional Testing PASS
ci/iol-intel-Performance         success  Performance Testing PASS
ci/iol-aarch64-compile-testing   success  Testing PASS

Commit Message

Christian Ehrhardt June 29, 2023, 12:58 p.m. UTC
  From: Christian Ehrhardt <christian.ehrhardt@canonical.com>

By adding -j we build in parallel, to make building on multiprocessor
machines more effective. While that works it does also break
reproducible builds as the order of the sphinx generated searchindex.js
is depending on execution speed of the individual processes.

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
---
 buildtools/call-sphinx-build.py | 5 -----
 1 file changed, 5 deletions(-)
  

Comments

Christian Ehrhardt June 29, 2023, 1:02 p.m. UTC | #1
On Thu, Jun 29, 2023 at 2:58 PM <christian.ehrhardt@canonical.com> wrote:
>
> From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
>
> By adding -j we build in parallel, to make building on multiprocessor
> machines more effective. While that works it does also break
> reproducible builds as the order of the sphinx generated searchindex.js
> is depending on execution speed of the individual processes.

Just FYI (this didn't fit well in the commit message): an example
of such a failure can be seen at
https://salsa.debian.org/paelzer-guest/dpdk/-/jobs/4372883

If you download the artifact, extract dpdk-doc, apply js-beautify for
readability and then diff the result, you'll find that it has the same
content in a different order.
Examples of two builds:
- https://paste.ubuntu.com/p/VhWYNRv7kN/
- https://paste.ubuntu.com/p/KcQk4Km9xM/
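
The "same content, different order" check described above could also be
scripted instead of done by eye. A minimal sketch, assuming the index file
has the form Search.setIndex({...}) with a JSON payload (an assumption
about the Sphinx output, not something shown in this thread). The helper
names load_index, canonical and same_content are hypothetical.

```python
import json


def load_index(text):
    """Parse the JSON payload of a searchindex.js passed in as a string.

    Assumes the layout 'Search.setIndex({...})'; this is an assumption
    about the Sphinx output format, not taken from the thread.
    """
    start = text.index('(') + 1
    end = text.rindex(')')
    return json.loads(text[start:end])


def canonical(value):
    """Return a string form that ignores dict key order and list order."""
    if isinstance(value, dict):
        # Keys are unique, so sorting the [key, value] pairs is well defined.
        items = sorted([key, canonical(val)] for key, val in value.items())
        return json.dumps(['dict', items])
    if isinstance(value, list):
        return json.dumps(['list', sorted(canonical(item) for item in value)])
    return json.dumps(value)


def same_content(a_text, b_text):
    """True if both index files hold the same values, in whatever order."""
    return canonical(load_index(a_text)) == canonical(load_index(b_text))
```

Note that this comparison deliberately discards list order at every level,
so it only answers "is anything missing or changed?"; it does not tell
whether a given reordering is harmless.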

> Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> ---
>  buildtools/call-sphinx-build.py | 5 -----
>  1 file changed, 5 deletions(-)
>
> diff --git a/buildtools/call-sphinx-build.py b/buildtools/call-sphinx-build.py
> index 39a60d09fa..d8879306de 100755
> --- a/buildtools/call-sphinx-build.py
> +++ b/buildtools/call-sphinx-build.py
> @@ -15,12 +15,7 @@
>  # set the version in environment for sphinx to pick up
>  os.environ['DPDK_VERSION'] = version
>
> -# for sphinx version >= 1.7 add parallelism using "-j auto"
> -ver = run([sphinx, '--version'], stdout=PIPE,
> -          stderr=STDOUT).stdout.decode().split()[-1]
>  sphinx_cmd = [sphinx] + extra_args
> -if Version(ver) >= Version('1.7'):
> -    sphinx_cmd += ['-j', 'auto']
>
>  # find all the files sphinx will process so we can write them as dependencies
>  srcfiles = []
> --
> 2.41.0
>
  
Thomas Monjalon July 3, 2023, 3:29 p.m. UTC | #2
29/06/2023 14:58, christian.ehrhardt@canonical.com:
> From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> 
> By adding -j we build in parallel, to make building on multiprocessor
> machines more effective. While that works it does also break
> reproducible builds as the order of the sphinx generated searchindex.js
> is depending on execution speed of the individual processes.
[...]
> -if Version(ver) >= Version('1.7'):
> -    sphinx_cmd += ['-j', 'auto']

What is the impact on build speed on an average machine?
  
Christian Ehrhardt July 6, 2023, 12:49 p.m. UTC | #3
On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
>
> 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> >
> > By adding -j we build in parallel, to make building on multiprocessor
> > machines more effective. While that works it does also break
> > reproducible builds as the order of the sphinx generated searchindex.js
> > is depending on execution speed of the individual processes.
> [...]
> > -if Version(ver) >= Version('1.7'):
> > -    sphinx_cmd += ['-j', 'auto']
>
> What is the impact on build speed on an average machine?

Hi,
I haven't tested this in isolation, as it was simply a mandatory change
on the Debian/Ubuntu side.
The time spent on the doc build alone is hidden inside the
concurrency of meson.
But I can compare a full build [1] with a full build that includes the
change [2].

That is an average build machine, and the full build is 35 seconds slower
once the doc build no longer runs in parallel.

[1]: https://launchpadlibrarian.net/673520160/buildlog_ubuntu-mantic-amd64.dpdk_22.11.2-2_BUILDING.txt.gz
[2]: https://launchpadlibrarian.net/674783718/buildlog_ubuntu-mantic-amd64.dpdk_22.11.2-3_BUILDING.txt.gz
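
To measure the doc build in isolation rather than inside a full meson
build, the sphinx-build call could be timed directly, once with and once
without "-j auto". A minimal sketch: time_command is a hypothetical helper,
and the paths in the usage comment are placeholders, not the DPDK layout.

```python
import subprocess
import time


def time_command(cmd):
    """Run cmd to completion and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL,
                   stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


# Example usage (SRC/OUT1/OUT2 are placeholders; a fresh output directory
# per run keeps cached results from skewing the comparison):
#   serial = time_command(['sphinx-build', '-b', 'html', 'SRC', 'OUT1'])
#   parallel = time_command(['sphinx-build', '-b', 'html', '-j', 'auto',
#                            'SRC', 'OUT2'])
```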
  
Thomas Monjalon Nov. 27, 2023, 4:45 p.m. UTC | #4
06/07/2023 14:49, Christian Ehrhardt:
> On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> >
> > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > >
> > > By adding -j we build in parallel, to make building on multiprocessor
> > > machines more effective. While that works it does also break
> > > reproducible builds as the order of the sphinx generated searchindex.js
> > > is depending on execution speed of the individual processes.
> > [...]
> > > -if Version(ver) >= Version('1.7'):
> > > -    sphinx_cmd += ['-j', 'auto']
> >
> > What is the impact on build speed on an average machine?
> 
> Hi,
> I haven't tested this in isolation as it was just a mandatory change
> on the Debian/Ubuntu side.
> And the time for exactly and only the doc build is hidden inside the
> concurrency of meson.
> But I can compare a full build [1] and a full build with the change [2].
> 
> That is an average build machine and it is 35 seconds slower with the
> change to no more do doc builds in parallel.

I would prefer adding an option for reproducible builds
(which are not a common requirement).
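
One way such an option might look inside call-sphinx-build.py is sketched
below. This is only an illustration: the "reproducible" switch and both
function names are hypothetical, and using SOURCE_DATE_EPOCH as the trigger
is an assumption, not something proposed in this thread. The Sphinx >= 1.7
version check from the original script is left out for brevity.

```python
import os


def build_sphinx_cmd(sphinx, extra_args, reproducible=False):
    """Return the sphinx-build command line as a list of arguments.

    'reproducible' is a hypothetical switch; how it would be exposed
    (meson option, environment variable, ...) is left open here.
    """
    cmd = [sphinx] + list(extra_args)
    if not reproducible:
        # Parallel build: faster, but output ordering may vary between runs.
        cmd += ['-j', 'auto']
    return cmd


def reproducible_requested(environ=os.environ):
    """Assumption: a set SOURCE_DATE_EPOCH signals a reproducible build."""
    return 'SOURCE_DATE_EPOCH' in environ
```

With this shape the default stays parallel, and only builds that ask for
reproducibility pay the serial-build cost.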
  
Bruce Richardson Nov. 27, 2023, 5 p.m. UTC | #5
On Mon, Nov 27, 2023 at 05:45:52PM +0100, Thomas Monjalon wrote:
> 06/07/2023 14:49, Christian Ehrhardt:
> > On Mon, Jul 3, 2023 at 5:29 PM Thomas Monjalon <thomas@monjalon.net> wrote:
> > >
> > > 29/06/2023 14:58, christian.ehrhardt@canonical.com:
> > > > From: Christian Ehrhardt <christian.ehrhardt@canonical.com>
> > > >
> > > > By adding -j we build in parallel, to make building on multiprocessor
> > > > machines more effective. While that works it does also break
> > > > reproducible builds as the order of the sphinx generated searchindex.js
> > > > is depending on execution speed of the individual processes.
> > > [...]
> > > > -if Version(ver) >= Version('1.7'):
> > > > -    sphinx_cmd += ['-j', 'auto']
> > >
> > > What is the impact on build speed on an average machine?
> > 
> > Hi,
> > I haven't tested this in isolation as it was just a mandatory change
> > on the Debian/Ubuntu side.
> > And the time for exactly and only the doc build is hidden inside the
> > concurrency of meson.
> > But I can compare a full build [1] and a full build with the change [2].
> > 
> > That is an average build machine and it is 35 seconds slower with the
> > change to no more do doc builds in parallel.
> 
> I would prefer adding an option for reproducible build
> (which is not a common requirement).
> 
Taking a slightly different tack, is it possible to sort the searchindex.js
file post-build, so that even reproducible builds get the benefits of
parallelism?

/Bruce
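
A post-build normalization step along these lines could be sketched as
follows. It assumes the file has the form Search.setIndex({...}) with a JSON
payload, which is an assumption about the Sphinx output rather than
something confirmed in this thread. normalize_searchindex is a hypothetical
helper name.

```python
import json


def normalize_searchindex(text):
    """Return searchindex.js content with object keys in sorted order.

    Only the order of object keys is normalized. If the nondeterminism
    also affects array order (for example lists whose positions are
    referenced by index elsewhere), that would need remapping, which this
    sketch does not attempt.
    """
    start = text.index('(') + 1
    end = text.rindex(')')
    data = json.loads(text[start:end])
    body = json.dumps(data, sort_keys=True, separators=(',', ':'))
    # Keep the wrapper before and after the payload unchanged.
    return text[:start] + body + text[end:]
```

Whether sorting keys is sufficient depends on where exactly the two builds
differ, which the pastes linked earlier in the thread would show.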
  

Patch

diff --git a/buildtools/call-sphinx-build.py b/buildtools/call-sphinx-build.py
index 39a60d09fa..d8879306de 100755
--- a/buildtools/call-sphinx-build.py
+++ b/buildtools/call-sphinx-build.py
@@ -15,12 +15,7 @@ 
 # set the version in environment for sphinx to pick up
 os.environ['DPDK_VERSION'] = version
 
-# for sphinx version >= 1.7 add parallelism using "-j auto"
-ver = run([sphinx, '--version'], stdout=PIPE,
-          stderr=STDOUT).stdout.decode().split()[-1]
 sphinx_cmd = [sphinx] + extra_args
-if Version(ver) >= Version('1.7'):
-    sphinx_cmd += ['-j', 'auto']
 
 # find all the files sphinx will process so we can write them as dependencies
 srcfiles = []