Results from multiple rows in a single field

By , May 16, 2014 19:12

Excuses, excuses
A short one for the first time in a very long time. After starting work for a new Company, I for one, don’t have much time to do any blogging and for another, I don’t do too much on databases any more (unfortunately), other than performance tuning by means of indexing.

Introduction
Today, for the first time in quite a while, I wrote a stored procedure! The requirement was to build an electronic invoice ledger from invoice data in an accounting system. Once I understood the data structure of the accounts, it was simple enough to extract the data that was required. One of the trickier parts was, wanting to keep it “set based”,  listing all invoice attachments in a single column in the final result. This meant having to take values from a column in multiple rows and concatenate them into a single entry. That is simple enough (yet very clunky), when working in a loop, but in a single select it is a bit tricky. Now this has been done often enough and there are plenty of solutions out on the net, in fact I am sure that what I did has been blogged about often enough. So, really just for my records and future reference, here is what I did.

One of the requirements was to get all attachments for an invoice and put them into a single field separated by a single space. Each attachment was a single record in a table. I am going to keep it simple here and only work with a pseudo attachment table and not bother with joining in the invoice data too. It will suffice to say, that the common-key between the invoice and its attachments is the barcode field.

Just do it
The DDL und DML for the test data:

use Work;

if (OBJECT_ID('dbo.attachments') is null)

	create table dbo.attachments (
		id int identity(1,1),
		barcode varchar(22),
		atdate date,
		attachment varchar(20)
		);
else
	truncate table dbo.attachments;

insert into dbo.attachments
	(barcode, atdate, attachment)
values
	('11111111111111111', GETDATE(), '9900'),
	('11111111111111111', GETDATE(), '9800'),
	('11111111111111111', GETDATE(), '9700'),
	('11111111111111111', GETDATE(), '9600'),
	('11111111111111111', GETDATE(), '9500'),
	('11111111111111111', GETDATE(), '9400'),
	('11111111111111111', GETDATE(), '9300'),
	('11111111111111111', GETDATE(), '9200'),
	('11111111111111111', GETDATE(), '9100'),
	('11111111111111111', GETDATE(), '9000'),
	('22222222222222222', GETDATE(), '8900'),
	('22222222222222222', GETDATE(), '8800'),
	('22222222222222222', GETDATE(), '8700'),
	('22222222222222222', GETDATE(), '8600'),
	('22222222222222222', GETDATE(), '8500'),
	('22222222222222222', GETDATE(), '8400'),
	('33333333333333333', GETDATE(), ''), --intentionally left blank to compensate for XML blank
	('33333333333333333', GETDATE(), '7800'),
	('33333333333333333', GETDATE(), '7700'),
	('33333333333333333', GETDATE(), '7600'),
	('33333333333333333', GETDATE(), '7500'),
	('33333333333333333', GETDATE(), '7400'),
	('33333333333333333', GETDATE(), '7300'),
	('33333333333333333', GETDATE(), '7200'),
	('33333333333333333', GETDATE(), '7100'),
	('33333333333333333', GETDATE(), '7000')
	go

Now the extraction DML:

use Work;

select at1.barcode, attachments =
	rtrim((
		select distinct attachment + ' '
		from attachments at2
		where at2.barcode = at1.barcode
		for xml path ('')
	))
from dbo.attachments at1
group by at1.barcode

I used the for xml path clause to get the attachment out of dbo.attachments as a single string. Because I concatenate with a blank (”) the XML node structure is not returned, but rather just all the results for the attachments in a single string of characters. If you don’t concatenate, then the result looks something like this:

Now because we are converting the values to XML the blank value for the attachment (first entry for ) is interpreted as the following:

 ” being an entity reference for the blank space character.

Now this is obviously not desired so that needs to be considered in the sub-select:

use Work;

select at1.barcode, attachments =
	rtrim(
			(
				select distinct attachment + ' '
				from attachments at2
				where at2.barcode = at1.barcode
				and isnull(at2.attachment, '') != ''
				for xml path ('')
			)
	)
from dbo.attachments at1
group by at1.barcode

Here is the final result:

A common requirement is to build a comma separated list. This would be done by concatenating ‘ ,’ before the value being selected and then using substring to remove the first comma.

The DML could look like this:

use Work;

select at1.barcode, attachments =
	substring(
		rtrim(
				(
					select distinct ', ' + attachment
					from attachments at2
					where at2.barcode = at1.barcode
					and isnull(at2.attachment, '') != ''
					for xml path ('')
				)
			),
			3, 1000)
from dbo.attachments at1
group by at1.barcode

And finally
I do hope that this may help someone or other out with solving their problem of getting multiple values into a single field.
Thanks for reading.

Leave a Reply

*

Panorama Theme by Themocracy