Improvement
Resolution: Fixed
Normal
Coordinates on the Japanese Wikipedia are written using Japanese characters, rather than the usual degrees, minutes and seconds signs. Some examples:
北緯43度2分39.22秒 東経141度21分9.77秒
南緯22度54分30秒 西経43度11分47秒
北緯35度39分59.81秒東経139度44分29.06秒
The important characters: 度 is degrees, 分 is minutes and 秒 is seconds. 北 is north, 南 is south, 東 is east and 西 is west.
There don't have to be any spaces or separators between the parts.
I wrote a quick bit of Perl to parse them which might be useful:
use utf8;
binmode STDOUT, ":utf8";

my @coords = (
    "北緯43度2分39.22秒 東経141度21分9.77秒",
    "南緯22度54分30秒 西経43度11分47秒",
    "北緯35度39分59.81秒東経139度44分29.06秒",
    "北緯３５度３９分５９．８１秒　東経１３９度４４分２９．０６秒",  # fullwidth variant
);

for my $coord (@coords) {
    $coord =~ tr/　．０-９/ .0-9/;  # replace fullwidth characters with normal ASCII
    $coord =~ s/(北|南)緯 *([0-9.]+)度 *([0-9.]+)分 *([0-9.]+)秒 *(東|西)経 *([0-9.]+)度 *([0-9.]+)分 *([0-9.]+)秒/$2° $3' $4" $1, $6° $7' $8" $5/;
    $coord =~ tr/北南東西/NSEW/;  # replace direction characters
    print "$coord\n";
}
Alternatively, the final loop with the Unicode characters written as \x{...} escapes:
for my $coord (@coords) {
    $coord =~ tr/\x{3000}\x{FF0E}\x{FF10}-\x{FF19}/ .0-9/;  # replace fullwidth characters with normal ASCII
    $coord =~ s/(\x{5317}|\x{5357})\x{7DEF} *([0-9.]+)\x{5EA6} *([0-9.]+)\x{5206} *([0-9.]+)\x{79D2} *(\x{6771}|\x{897F})\x{7D4C} *([0-9.]+)\x{5EA6} *([0-9.]+)\x{5206} *([0-9.]+)\x{79D2}/$2\x{00B0} $3' $4" $1, $6\x{00B0} $7' $8" $5/;
    $coord =~ tr/\x{5317}\x{5357}\x{6771}\x{897F}/NSEW/;  # replace direction characters
    print "$coord\n";
}
I haven't actually come across any using fullwidth ASCII characters, but I included stuff to handle that anyway just in case.
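If signed decimal degrees are more convenient than a reformatted string, the same pattern could feed a small conversion. This is only a sketch: the helper name parse_ja_coord is hypothetical and not part of the snippet above, and it assumes degrees, minutes and seconds are all present, as in the examples. South and west come back as negative values.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not from the original snippet): parse a Japanese
# Wikipedia coordinate string into signed decimal degrees.
# Returns (latitude, longitude), or an empty list if nothing matches.
sub parse_ja_coord {
    my ($coord) = @_;

    # Replace fullwidth space, full stop and digits with ASCII equivalents.
    $coord =~ tr/\x{3000}\x{FF0E}\x{FF10}-\x{FF19}/ .0-9/;

    # One number: digits with an optional decimal part.
    my $num = qr/([0-9]+(?:\.[0-9]+)?)/;

    my @m = $coord =~ /(北|南)緯 *$num度 *$num分 *$num秒 *(東|西)経 *$num度 *$num分 *$num秒/;
    return unless @m;

    my ($ns, $lat_d, $lat_m, $lat_s, $ew, $lon_d, $lon_m, $lon_s) = @m;

    # degrees + minutes/60 + seconds/3600
    my $lat = $lat_d + $lat_m / 60 + $lat_s / 3600;
    my $lon = $lon_d + $lon_m / 60 + $lon_s / 3600;

    # South and west are negative.
    $lat = -$lat if $ns eq '南';
    $lon = -$lon if $ew eq '西';

    return ($lat, $lon);
}

my ($lat, $lon) = parse_ja_coord("南緯22度54分30秒 西経43度11分47秒");
printf "%.4f, %.4f\n", $lat, $lon;   # prints "-22.9083, -43.1964"
```

Returning an empty list on failure lets a caller write `my ($lat, $lon) = parse_ja_coord($text) or next;` when scanning many strings.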