
Errored QueryStream causes Client to throw on message receive #1674

Closed
rkaw92 opened this issue Jun 13, 2018 · 7 comments

@rkaw92 commented Jun 13, 2018

Hi,
I have encountered a really bizarre issue when using knex with pg and query streams. A parameterized QueryStream which fails in the parsing stage (e.g. because of a column type / value type mismatch) breaks subsequent queries on the same connection - either by:

  • throwing errors on incoming data packets from Postgres, terminating the Node process like this:
/home/thewanderer/Devel/Node/pg-prepared-statements/node_modules/pg/lib/client.js:212
    self.activeQuery.handleRowDescription(msg)
                    ^
TypeError: Cannot read property 'handleRowDescription' of null
    at .<anonymous> (/home/thewanderer/Devel/Node/pg-prepared-statements/node_modules/pg/lib/client.js:212:21)
    at emitOne (events.js:101:20)
    at emit (events.js:188:7)
    at Socket.<anonymous> (/home/thewanderer/Devel/Node/pg-prepared-statements/node_modules/pg/lib/connection.js:123:12)
    at emitOne (events.js:96:13)
    at Socket.emit (events.js:188:7)
    at readableAddChunk (_stream_readable.js:177:18)
    at Socket.Readable.push (_stream_readable.js:135:10)
    at TCP.onread (net.js:542:20)
  • or by messing with the output of subsequent queries - queries receive empty result.rows (this is much harder to reproduce and will not be covered here, but it occurs in one of our apps using knex)

Below are two ways to reproduce this issue, one using knex and another simplified to bare pg (which simulates how knex interacts with pg).

Test conditions:

  • A table in the DB with a column of type "uuid": CREATE TABLE events (id UUID);
    • Note: the bug occurs in the same way even if the table does not exist.
  • pg@7.4.3
  • pg-query-stream@1.1.1
  • knex@0.14.6 (optional, for the knex-enabled test)
  • PostgreSQL 9.6.7 (this probably does not matter)
'use strict';

function runKnex() {
	const knex = require('knex');
	const db = knex({
		client: 'pg',
		connection: {
			host: process.env.PGHOST,
			user: process.env.PGUSER,
			password: process.env.PGPASSWORD,
			database: process.env.PGDATABASE
		},
		pool: {
			min: 1,
			max: 1
		}
	});
	const resultStream = db('events').select('*').where({ id: 'not a UUID' }).stream();
	resultStream.on('data', function() {});

	resultStream.on('error', function(error) {
		console.error('stream error:', error);
	});

	return db.raw('SELECT NOW() AS now').then(function(rows) {
		console.log('now: %s', rows[0].now);
	});
}

function runPg() {
	const Client = require('pg').Client;
	const QueryStream = require('pg-query-stream');
	const client = new Client();
	return client.connect().then(function() {
		const querySpec = new QueryStream('SELECT * FROM events WHERE id = $1', [ 'not a UUID' ]);
		const rawStream = client.query(querySpec);
		rawStream.on('error', function(error) {
			console.log('stream error:', error);
			rawStream.close();
		});
		rawStream.on('data', function() {});

		client.query('SELECT NOW() AS now').then(function(result) {
			console.log('now: %s', result.rows[0].now);
		});
	}).catch(function(error) {
		console.error(error);
		process.exit(5);
	});
}

runPg();
// or alternatively:
// runKnex();

What should happen:

  • A "stream error" should be displayed about invalid UUID input syntax (from the first query)
  • The current date/time should be logged (from the second query)

What actually happens:

  • A stream error is displayed (from the first query)
  • The process crashes with TypeError: Cannot read property 'handleRowDescription' of null, as described at the beginning of this bug report

It does not crash if:

  • The id param is a valid UUID string like 8b0fb8cb-b145-4e41-a847-dfe2f1a47aa1 (the second query then also runs OK)
  • We remove rawStream.close(); from our stream error handler (but knex.js does it automatically)

Overall, I think some state machine inside pg.Client breaks and activeQuery gets set to null as a result, but I have not been able to pinpoint exactly where or why it happens. It seems to be connected to closing the ReadableStream returned from client.query().

This is especially hard to debug when using pg with a pool (such as knex.js's), because the error may go unnoticed until the same client object is re-acquired from the pool and used; this is why I opted for single-connection knex / pg examples above.
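
For illustration, here is a rough sketch (untested, assuming the same events table and the same invalid UUID parameter) of how this can hide behind a plain pg.Pool until the broken client is checked out again:

const { Pool } = require('pg');
const QueryStream = require('pg-query-stream');

const pool = new Pool({ max: 1 }); // one physical connection, like the knex pool above

pool.connect().then(function(client) {
	const stream = client.query(new QueryStream('SELECT * FROM events WHERE id = $1', [ 'not a UUID' ]));
	stream.on('data', function() {});
	stream.on('error', function(error) {
		console.error('stream error:', error);
		stream.close();   // same close() that knex performs on error
		client.release(); // the damaged client goes back into the pool looking healthy
	});

	// Waits for the single connection to be released, then reuses it;
	// with the damaged client, this is where the crash described above can surface.
	return pool.query('SELECT NOW() AS now').then(function(result) {
		console.log('now: %s', result.rows[0].now);
	});
});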

@sehrope (Contributor) commented Jun 13, 2018

I'm able to reproduce this and I think it's related to the internal client query queue. I haven't totally figured out the cause, but I suspect it's from the FEBE messages for the stream.close() being sent after the messages for the next query. If you change the repro code to wait for the completion of stream.close(...) by giving it a callback and waiting for its execution, then the error does not occur:

// npm install pg pg-query-stream async-main
const pg = require('pg');
const { Client } = pg;
const QueryStream = require('pg-query-stream');
const main = require('async-main').default;

function log(message, ...args) {
    console.log(`%s ${message}`, new Date().toISOString(), ...args);
}

function streamToPromise(stream) {
    return new Promise((resolve, reject) => {
        stream.on('end', () => resolve());
        stream.on('error', reject);
    });
}

function streamClose(stream) {
    return new Promise((resolve) => {
        stream.close(resolve);
    });
}

main(async () => {
    log('Started');
    const client = new Client(process.env.DATABASE_URL);
    try {
        log('Connecting');
        await client.connect();
        log('Connected');

        let stream;
        try {
            const query = new QueryStream('SELECT $1::uuid AS x', ['bad data']);
            stream = client.query(query);
            stream.on('data', (data) => {
                log('Received data: %j', data);
            });
            await streamToPromise(stream);
        } catch (e) {
            log('Error streaming query: %s', e);
        } finally {
            if (stream) {
                log('Closing stream');
                // Replace with `stream.close()` to get the error
                await streamClose(stream);
                log('After close stream');
            }         
        }
        log('Before execute new query');
        const result = await client.query('SELECT 1 AS x');
        log('Result: %j', result.rows);
    } finally {
        await client.end();
    }
});

Output when waiting for stream.close():

2018-06-13T11:48:29.269Z Started
2018-06-13T11:48:29.272Z Connecting
2018-06-13T11:48:29.279Z Connected
2018-06-13T11:48:29.281Z Error streaming query: error: invalid input syntax for uuid: "bad data"
2018-06-13T11:48:29.281Z Closing stream
2018-06-13T11:48:29.324Z After close stream
2018-06-13T11:48:29.324Z Before execute new query
2018-06-13T11:48:29.326Z Result: [{"x":1}]

Replacing the await streamClose(...) with a regular stream.close() produces the same error you reported.

There's probably a fix that could be added to the driver to better handle this situation, but knex should also be waiting for the close() event prior to continuing.
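
For illustration, here is a sketch of that pattern applied to the bare-pg repro from the original report; only the error handler changes, and close() is given a callback the same way streamClose() above does:

rawStream.on('error', function(error) {
	console.log('stream error:', error);
	// Only reuse the client after close() has completed; sending the next query
	// before that appears to be what corrupts the client's internal query queue.
	rawStream.close(function() {
		client.query('SELECT NOW() AS now').then(function(result) {
			console.log('now: %s', result.rows[0].now);
		});
	});
});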

@mrhwick mentioned this issue Jul 17, 2018
@BasixKOR

Is there any progress on this issue? I think I just hit this error, and it seems like this issue has been abandoned.

@alxndrsn (Contributor) commented Sep 30, 2022

Here's a repro for pg v8.8.0 & node v16.13.2 & v18.6.0, postgres 12.12:

const pg = require('pg');
const QueryStream = require('pg-query-stream');

const pool = new pg.Pool({
  poolSize: 1,
  connectionTimeoutMillis: 400,
  statement_timeout: 400,
});

(async () => {
  const query = new QueryStream('SELECT TRUE');
  const client = await pool.connect();
  const stream = await client.query(query);

  try {
    const res = await pool.query('SELECT TRUE');
    throw new Error(`Unexpected res (should have failed): ${JSON.stringify(res, null, 2)}`);
  } catch(err) {
    if(err.message !== 'timeout exceeded when trying to connect') throw new Error(`Unexpected failure: ${err}`);
  }

  await stream.destroy(); // without this, next attempt at pool.query() will hang
  await client.release();

  const res = await pool.query('SELECT TRUE AS val');
  if(res.rows.length !== 1 || res.rows[0].val !== true) {
    throw new Error(`Unexpected res: ${JSON.stringify(res.rows)}`);
  }
})();

Output (node 16.13.2)

~/node_modules/pg/lib/client.js:340
    this.activeQuery.handleRowDescription(msg)
                     ^

TypeError: Cannot read properties of null (reading 'handleRowDescription')
    at Client._handleRowDescription (~/node_modules/pg/lib/client.js:340:22)
    at Connection.emit (node:events:390:28)
    at ~/node_modules/pg/lib/connection.js:114:12
    at Parser.parse (~/node_modules/pg-protocol/dist/parser.js:40:17)
    at Socket.<anonymous> (~/node_modules/pg-protocol/dist/index.js:11:42)
    at Socket.emit (node:events:390:28)
    at addChunk (node:internal/streams/readable:315:12)
    at readableAddChunk (node:internal/streams/readable:289:9)
    at Socket.Readable.push (node:internal/streams/readable:228:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:199:23)

postgres version

# SELECT VERSION();
                                                                version                                                                 
----------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 12.12 (Ubuntu 12.12-0ubuntu0.20.04.1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 9.4.0-1ubuntu1~20.04.1) 9.4.0, 64-bit
(1 row)

Output (node 18.6.0)

~/node_modules/pg/lib/client.js:340
    this.activeQuery.handleRowDescription(msg)
                     ^

TypeError: Cannot read properties of null (reading 'handleRowDescription')
    at Client._handleRowDescription (~/node_modules/pg/lib/client.js:340:22)
    at Connection.emit (node:events:513:28)
    at ~/node_modules/pg/lib/connection.js:114:12
    at Parser.parse (~/node_modules/pg-protocol/dist/parser.js:40:17)
    at Socket.<anonymous> (~/node_modules/pg-protocol/dist/index.js:11:42)
    at Socket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:324:12)
    at readableAddChunk (node:internal/streams/readable:297:9)
    at Readable.push (node:internal/streams/readable:234:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)

Other PostgreSQL versions

I've now tested and recreated this issue with postgres versions:

  • PostgreSQL 14.5 (Debian 14.5-1.pgdg110+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 10.2.1-6) 10.2.1 20210110, 64-bit, node v16.17.1
  • PostgreSQL 13.2 (Debian 13.2-1.pgdg100+1) on x86_64-pc-linux-gnu, compiled by gcc (Debian 8.3.0-6) 8.3.0, 64-bit, node v16.17.1

However, it does NOT occur on:

  • PostgreSQL 13.2 on x86_64-apple-darwin19.6.0, compiled by Apple clang version 12.0.0 (clang-1200.0.32.29), 64-bit, node v16.17.1
  • PostgreSQL 12.1 on x86_64-apple-darwin16.7.0, compiled by Apple LLVM version 8.1.0 (clang-802.0.42), 64-bit, node v16.4.2

@alxndrsn (Contributor)

Tested with:

  • pg: 8.5.1
  • node: v14.19.3
  • postgres: PostgreSQL 12.10 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 7.3.1 20180712 (Red Hat 7.3.1-12), 64-bit

Output:
(node:781) UnhandledPromiseRejectionWarning: Error: Unexpected res: []
    at ~/db-test.js:35:15
    at processTicksAndRejections (internal/process/task_queues.js:95:5)
(Use `node --trace-warnings ...` to show where the warning was created)
(node:781) UnhandledPromiseRejectionWarning: Unhandled promise rejection. This error originated either by throwing inside of an async function without a catch block, or by rejecting a promise which was not handled with .catch(). To terminate the node process on unhandled promise rejection, use the CLI flag `--unhandled-rejections=strict` (see https://nodejs.org/api/cli.html#cli_unhandled_rejections_mode). (rejection id: 1)
(node:781) [DEP0018] DeprecationWarning: Unhandled promise rejections are deprecated. In the future, promise rejections that are not handled will terminate the Node.js process with a non-zero exit code.
~/node_modules/pg/lib/client.js:340
    this.activeQuery.handleRowDescription(msg)
                     ^

TypeError: Cannot read property 'handleRowDescription' of null
    at Client._handleRowDescription (~/node_modules/pg/lib/client.js:340:22)
    at Connection.emit (events.js:400:28)
    at ~/node_modules/pg/lib/connection.js:115:12
    at Parser.parse (~/node_modules/pg-protocol/dist/parser.js:40:17)
    at Socket.<anonymous> (~/node_modules/pg-protocol/dist/index.js:10:42)
    at Socket.emit (events.js:400:28)
    at addChunk (internal/streams/readable.js:293:12)
    at readableAddChunk (internal/streams/readable.js:267:9)
    at Socket.Readable.push (internal/streams/readable.js:206:10)
    at TCP.onStreamRead (internal/stream_base_commons.js:188:23)

@nathanjcochran commented May 25, 2023

I believe I am hitting this issue as well. In my case, I am trying to cancel a QueryStream query with pg_cancel_backend(), close the stream, and then issue another query on the same connection.

Here's a reproduction (pg: 8.11.0, pg-query-stream: 4.5.0, node 18.9.0, postgres: 13.6):

const pg = require('pg');
const QueryStream = require('pg-query-stream');

(async () => {
  const pool = new pg.Pool();
  pool.on('error', (err) => {
    console.error('Pool error: ', err);
  });

  const conn = await pool.connect();
  conn.on('error', (err) => {
    console.error('Connection error: ', err);
  });

  const result = await conn.query('SELECT pg_backend_pid() AS pid;');
  const { pid } = result.rows[0];

  const stream = conn.query(new QueryStream('SELECT pg_sleep(10);'));
  stream.on('data', (chunk) => {
    console.log('stream data: ', chunk);
  });
  stream.on('error', (err) => {
    console.error('stream error: ', err);
  });

  // Create a promise that is resolved when the stream is closed
  const closed = new Promise((res) => {
    stream.on('close', res);
  });

  // Wait 1 second before cancelling the query
  await new Promise((res) => {
    setTimeout(() => res(), 1000);
  });

  await pool.query('SELECT pg_cancel_backend($1);', [pid]);

  stream.destroy();
  await closed;

  // Triggers uncaught exception
  await conn.query('SELECT pg_sleep(1);');
  
  conn.release()
  await pool.end()
})();

This sometimes results in the following uncaught exception (it appears to be a bit of a race condition):

TypeError: Cannot read properties of null (reading 'handleRowDescription')
    at Client._handleRowDescription (~/node_modules/pg/lib/client.js:362:22)
    at Connection.emit (node:events:513:28)
    at ~/node_modules/pg/lib/connection.js:117:12
    at Parser.parse (~/node_modules/pg-protocol/dist/parser.js:40:17)
    at Socket.<anonymous> (~/node_modules/pg-protocol/dist/index.js:11:42)
    at Socket.emit (node:events:513:28)
    at addChunk (node:internal/streams/readable:324:12)
    at readableAddChunk (node:internal/streams/readable:297:9)
    at Readable.push (node:internal/streams/readable:234:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23)

Waiting for the stream to be closed should theoretically be a reliable signal that the connection is ready to be used again, since the stream isn't closed until a ReadyForQuery message is received - see here and here. However, I believe the ReadyForQuery message is being sent multiple times, which means that even if we wait for it before issuing another query on the connection, it can happen again, which breaks the state machine if another query has already been issued on the connection.

Specifically, ReadyForQuery causes the activeQuery to be set to null here. But if we already issued a second query on the connection before that happens, then when the RowDescription message is returned for the second query, it causes the TypeError: Cannot read properties of null (reading 'handleRowDescription') error here, which crashes the process.

I think the multiple ReadyForQuery messages have to do with the fact that I am both destroying the query stream (which I think triggers one Sync) and issuing a pg_cancel_backend() call, which causes the connection to error and thereby triggers a second Sync. The race condition comes down to which of those happens first (if the error message happens first, only one Sync is triggered due to this check, but if the error happens second, then two Sync messages are triggered, which causes the issue).

Given that, I believe #2836 (which looks like it would prevent the second Sync from the error message if the Sync from closing the stream had already been fired) should fix the issue for me. Would love to see that get merged!
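
A very rough sketch (hypothetical names, not the actual pg-cursor source) of the kind of guard described above: the cursor remembers whether a Sync has already been sent, so an error arriving after close() does not queue a second one.

class CursorSketch {
	constructor(connection) {
		this.connection = connection;
		this._syncSent = false; // hypothetical flag; the real change lives in #2836
	}

	_syncOnce() {
		if (this._syncSent) return; // a Sync is already in flight, do not queue another
		this._syncSent = true;
		this.connection.sync();
	}

	close(callback) {
		this._syncOnce();
		// ...wait for ReadyForQuery, then invoke callback
	}

	handleError(err) {
		this._syncOnce();
		// ...emit the error to the stream
	}
}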

EDIT: Just tested #2836, and it does indeed fix the issue 🎉.

@lognaturel

Consider closing now that #2836 is merged. At least two consumers have confirmed that the issue is fixed.

@brianc (Owner) commented Mar 22, 2024

Good point, thanks!

@brianc closed this as completed Mar 22, 2024