CI jobs using npm might suffer from a 10 minutes delay


Antoine Musso-3
Hello,

Since June 27th, any CI job running 'npm install' might suffer from an
extra 10-minute delay.

Somehow, when requesting package information from the NpmJS CDN
(CloudFlare), the connection hangs for ten minutes.  npm just idles,
waiting for a reply, and eventually shows:

  npm ERR! registry error parsing json

npm then retries and proceeds as usual.


The JSON error is caused by CloudFlare returning an HTML page that states:

 The page could not be rendered due to a temporary fault.
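
As a rough illustration of why an HTML error page surfaces as a JSON
parsing error, here is a minimal Python sketch. The function name and the
HTML heuristic are assumptions made for this example; this is not how npm
itself is implemented.

```python
import json


def classify_registry_body(body: str) -> str:
    """Classify a response body as 'json', 'html', or 'unknown'.

    Hypothetical helper: a client that expects JSON metadata fails to
    parse the body when the CDN answers with an HTML error page instead.
    """
    try:
        json.loads(body)
        return "json"
    except ValueError:  # json.JSONDecodeError is a subclass of ValueError
        pass
    # Crude heuristic (an assumption): HTML pages start with a doctype or <html>.
    head = body.lstrip().lower()
    if head.startswith("<!doctype html") or head.startswith("<html"):
        return "html"
    return "unknown"
```

A check like this in a CI wrapper could at least tell a CDN error page
apart from genuinely malformed metadata.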


The impact is that any Jenkins job using npm has a high chance of taking
10 more minutes to build.  That notably affects MediaWiki core and all
its extensions.


A few minutes ago, I made a change to run npm with --loglevel=info,
which makes npm emit more information to the console and should give
some hints about what it is doing.  (--loglevel=verbose would produce
far too much output, though.)

I have filed a bug with npm: https://github.com/npm/npm/issues/21101
Our task: https://phabricator.wikimedia.org/T198348


I have no idea how to mitigate the issue :-(

--
Antoine "hashar" Musso


_______________________________________________
Wikitech-l mailing list
[hidden email]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Re: CI jobs using npm might suffer from a 10 minutes delay

David Barratt
If we were to upgrade npm to 5+ (I think), we would get support for
package-lock.json:
https://docs.npmjs.com/files/package-lock.json

That file would be committed to our code repository and allow the CI to
skip the dependency resolution step(s). This means that the tar/zip or
clone would happen directly without involving registry.npmjs.org at all.
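
To make concrete what a committed lock file pins, here is a small Python
sketch that walks the "dependencies" tree of a package-lock.json. It
assumes the lockfileVersion 1 layout written by npm 5; the package names
and versions in the sample are invented for illustration. Note that the
"resolved" URLs usually still point at registry.npmjs.org, so the lock
file mainly removes the metadata lookups, and tarball downloads may still
reach the registry.

```python
import json
from typing import Iterator, Tuple


def iter_locked(deps: dict) -> Iterator[Tuple[str, str, str]]:
    """Yield (name, version, resolved URL) for every locked dependency,
    descending into nested "dependencies" entries."""
    for name, info in deps.items():
        yield name, info.get("version", ""), info.get("resolved", "")
        yield from iter_locked(info.get("dependencies", {}))


# Hypothetical lock file content; the packages do not need to exist.
SAMPLE_LOCK = json.loads("""
{
  "name": "example-project",
  "version": "0.0.1",
  "lockfileVersion": 1,
  "dependencies": {
    "example-a": {
      "version": "1.0.0",
      "resolved": "https://registry.npmjs.org/example-a/-/example-a-1.0.0.tgz",
      "dependencies": {
        "example-b": {
          "version": "2.3.4",
          "resolved": "https://registry.npmjs.org/example-b/-/example-b-2.3.4.tgz"
        }
      }
    }
  }
}
""")

LOCKED = list(iter_locked(SAMPLE_LOCK["dependencies"]))
for name, version, resolved in LOCKED:
    print(f"{name}@{version} -> {resolved}")
```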

Upgrading npm (and adding a package-lock.json file) might be outside of the
scope of this issue, but I think it would help mitigate/resolve the problem.

On Thu, Jun 28, 2018 at 10:26 AM Antoine Musso <[hidden email]> wrote:

> Hello,
>
> Since June 27th, any CI job running 'npm install' might suffer from a 10
> minutes extra delay.
> <snip>

Re: [QA] CI jobs using npm might suffer from a 10 minutes delay

Željko Filipin
On Thu, Jun 28, 2018 at 4:42 PM David Barratt <[hidden email]>
wrote:

> If we were to upgrade npm to 5+ (I think) that would support
> package-lock.json
>

That is discussed in https://phabricator.wikimedia.org/T179229

Željko

Re: CI jobs using npm might suffer from a 10 minutes delay

Antoine Musso-3
On 28/06/2018 16:25, Antoine Musso wrote:

> Hello,
>
> Since June 27th, any CI job running 'npm install' might suffer from a 10
> minutes extra delay.
>
> Somehow when requesting package informations from the NpmJS CDN
> (CloudFlare), the connection holds for ten minutes.  npm just idles
> waiting for a reply. Then eventually it shows:
>
>   npm ERR! registry error parsing json
<snip>
> Our task: https://phabricator.wikimedia.org/T198348

Npmjs seems to have implemented a fix although we are still hitting the
issue:  https://status.npmjs.org/incidents/51c7q80zsj9f


A few minutes ago, I bumped the default job timeout from 30 minutes to
45 minutes.  Jobs will still be slow, but at least they should no longer
time out (and will succeed when they should).

https://gerrit.wikimedia.org/r/#/c/integration/config/+/442988/


--
Antoine Musso



Re: CI jobs using npm might suffer from a 10 minutes delay

Antoine Musso-3
On 28/06/2018 23:28, Antoine Musso wrote:

>>   npm ERR! registry error parsing json
> <snip>
>> Our task: https://phabricator.wikimedia.org/T198348
> Npmjs seems to have implemented a fix although we are still hitting the
> issue:  https://status.npmjs.org/incidents/51c7q80zsj9f
>
>
> A few minutes ago, I have bumped the default timeout from 30 minutes to
> 45 minutes.  So jobs will still be slow, but at least they should
> succeed (when they should).
>
> https://gerrit.wikimedia.org/r/#/c/integration/config/+/442988/

Hello,

The issue from June 28th was resolved but has appeared again today.


CI jobs using npm are once again showing the delay. Same symptom:

 npm ERR! registry error parsing json


I have bumped the job timeout again from 30 minutes to 45 minutes and
reopened the task https://phabricator.wikimedia.org/T198348


--
Antoine "hashar" Musso

