Nearest neighbor - Joining two tables

Hello there! :)

Just want to say, I'm still very very new to messing around with geospatial data - The reason for doing it, is to make a better analysis for my master thesis - So please be patient if you can :)

I have two tables: DAWA and Node

DAWA contains: 800.000 entries

  • ConCatAddress = varchar(255)
  • wgs84koordinat_long = float
  • wgs84_koordinat_lat = float
  • Geog = ([GEOGRAPHY]::STGeomFromText(((('Point('+CONVERT([varchar](32),[wgs84koordinat_long]))+' ')+CONVERT([varchar](32),[wgs84koordinat_lat]))+')',(4326)))

Node contains: 900 entries

  • NodeID = int
  • wgs84_long = float
  • wgs84_lat = float
  • Geog ([GEOGRAPHY]::STGeomFromText(((('Point('+CONVERT([varchar](32),[wgs84_long]))+' ')+CONVERT([varchar](32),[wgs84_lat]))+')',(4326)))

What I need to to, is to find the nearest point from an address in DAWA with an entry from Node and Update DAWA with a column called "NearestPoint" and "distance" with the finding.

Right now, I'm using the SQL in the bottom of this post, but here I have to run i manually 900 times, which most likely will get the same address more than once and will take me forever. As you can see, the SQL below will answer my question: Which addresses is nearest to this point.

What I want it to answer is: Which point, from Node, is nearest to this specific address in DAWA.

SQL:

DECLARE @Lat AS float, @Long AS float, @adresse AS varchar(255);
DECLARE @MaxRange AS int, @MinRange AS int, @RangeDivisor AS float, @RangeUnit AS varchar(10) , @RangeDelimiter AS varchar(2);
SET @MinRange = 0;
SET @MaxRange = 10;
SET @RangeDivisor = 1000;
SET @RangeDelimiter = 4;
SET @RangeUnit = 'KM';

SELECT @Lat = dbo.DAWA.wgs84koordinat_lat, @Long = dbo.DAWA.wgs84koordinat_long, @adresse = dbo.DAWA.ConCatAddress
FROM dbo.DAWA --WHERE postnr = '6700';
WHERE wgs84koordinat_long = '10.1894185761792' AND wgs84koordinat_lat = '56.1940385664007'; --FiberNode lokation (NodeID='8' Adr='8200 Aarhus N Skejbyparken 4 1 1')

WITH Distance AS (
SELECT
  KVHx,
  wgs84koordinat_lat,
  wgs84koordinat_long,
  ConCatAddress,
  Geog.STDistance(geography::Point(@Lat, @Long, 4326))/@RangeDivisor [Dist],
  @RangeUnit [Unit]
FROM dbo.DAWA)

SELECT KVHx,  ConCatAddress AS [Address], wgs84koordinat_lat AS latgrad, wgs84koordinat_long AS longgrad, LEFT(Dist,@RangeDelimiter) AS Dist, Unit
FROM Distance
WHERE Dist <= @MaxRange AND Dist >= @MinRange 
ORDER BY Dist

Can any of you help me? I've tried to make some JOINS, but I can't make it how I want.. ! :(

I've found a picture that maybe explaining it better than I:

Picture showing how I want the end result

August 27th, 2015 8:43am

Hi Patrick

So usual approach to do this is to take the cross product of the two tables and calculate the distance between each set of point and then to rank them. I have a simple example using Geometry (it easier to use than Geography if you do not already have points) .

Begin with some sample data

CREATE TABLE #DAWA
(
	ConcatAddress VARCHAR(255) PRIMARY KEY,
	Geog GEOMETRY,
	NearestNeighborNodeId INT,
	NearestNeighborNodeDistance FLOAT
)

create Table #Node
(
	NodeId INT IDENTITY(1,1),
	Geog GEOMETRY,
)

INSERT INTO #DAWA(ConcatAddress,Geog) VALUES
('Adress 1',GEOMETRY::STGeomFromText('POINT (1 7)', 0)),
('Adress 2',GEOMETRY::STGeomFromText('POINT (9 2)', 0)),
('Adress 3',GEOMETRY::STGeomFromText('POINT (10 0)', 0)),
('Adress 4',GEOMETRY::STGeomFromText('POINT (5 5)', 0))

INSERT INTO #node(Geog) VALUES
(GEOMETRY::STGeomFromText('POINT (6 2)', 0)),
(GEOMETRY::STGeomFromText('POINT (4 5)', 0)),
(GEOMETRY::STGeomFromText('POINT (8 7)', 0));

Next we use a cross join to create the cross product, then using the ROW_NUMBER() OVER function to rank the distance (STDistance) between each set of points in your two tables, then update the Original DAWA table with the closest one which is where your rank is = 1

;WITH NearestNeighbor AS
(
	SELECT 
		ROW_NUMBER() OVER (PARTITION BY D.ConcatAddress ORDER BY D.Geog.STDistance(N.Geog) ) [NeighborRank],
		D.ConcatAddress,
		N.NodeId [NearestNeighborNodeId],
		D.Geog.STDistance(N.Geog) [Distance]
	FROM #DAWA D
	CROSS JOIN #Node N
)

UPDATE #DAWA
SET	NearestNeighborNodeId = NN.[NearestNeighborNodeId],
	NearestNeighborNodeDistance = nn.[Distance]
FROM #DAWA D inner join NearestNeighbor NN on NN.ConcatAddress = D.ConcatAddress
WHERE [NeighborRank] = 1;

The DAWA table has now been updated with the nearest NodeId as well as the distance between those nodes.

SELECT * FROM #DAWA;

DROP TABLE #DAWA;
DROP TABLE #Node;
This should work just fine with reasonably size loads but the cross product will become slow as you begin to add thousands or millions of points into the table, t that point you would want to look at spatial indexes 

Free Windows Admin Tool Kit Click here and download it now
August 28th, 2015 1:38am

This topic is archived. No further replies will be accepted.

Other recent topics Other recent topics