Details of using Common Lisp to download and parse VTA.org's Web pages, locate
the tables of data, and glean useful data from those tables:

http://www.vta.org/ = main page
I manually clicked from this to the next page:
http://www.vta.org/schedules/schedules_bynumber.html = master listing of
  schedules, in groups by vehicle/route type

This is the Web page I'm processing first below:
(sub-curl-run1 "http://www.vta.org/schedules/schedules_bynumber.html" 1)
# 1 -rw------- 1 rem user 317 May 8 01:11 tmp-curl-errout01 (ignore)
 50 -rw------- 1 rem user 49793 May 8 01:11 tmp-curl-stdout01 (use this)
;Load that stdout file as list of lines:
(length (setq lines (mapcar #'(lambda (str) (string-trim g*whitechars str))
                            (load-file-by-method (num-make-std-out-filnam 1)
                                                 :ALLLINES))))
1555
;Parse just the HTTP header at the start of that file:
(car (setq rec (curl-stdout-lines-parse1 lines)))
:DOCUMENT
(cadr rec)
"text/html"
;Write the main body, the HTML part, out to a new file:
(with-open-file (ochan "tmp-vtaorg-1.html" :DIRECTION :OUTPUT
                       :IF-EXISTS :SUPERSEDE)
  (loop for line in (caddr rec) do (format ochan "~A~%" line)))
;Create a dataflow object and issue directive to parse that HTML file to DOM:
(setq na1 (make-nestal :VTATOP))
(nestal+tag+value-install na1 :FILENAME "tmp-vtaorg-1.html")
(nestal-dataflow-want na1 :DOM)
;Find where a key piece of text is located within the DOM:
(dom+string-find-common-path-inx (nth 1 na1) "Regular Bus Service")
(3 3 10 6 4 6)
;Browsing deeper from there, I find the main table listing routes:
Path: (nestal+key-find na1 :DOM)
Path: (3 3 10 6 4 6 46 3)
Next index 8 thru 132 are the flat mix of spacers, titles, GOTOs, routes.
(length (setq g*vta-top-rows
  (subseq (nestal+pathixs-fetch (nestal+key-find na1 :DOM)
                                '(3 3 10 6 4 6 46 3))
          8 133)))
;That global contains all the data I'll need to glean from.
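;Aside: the idea of an "index path" into the DOM can be pictured with a small
; sketch. This is NOT the actual nestal code (which isn't shown in these
; notes); sketch-path-fetch, sketch-find-path and *sketch-dom* are hypothetical
; names, and I'm assuming the DOM is a tree of proper lists indexed from zero.
; The real dom+string-find-common-path-inx apparently reports a common
; container path, which this simplified stand-in does not attempt.

```lisp
;; Hypothetical stand-ins for illustration only; the real
;; nestal+pathixs-fetch and dom+string-find-common-path-inx may differ.

(defun sketch-path-fetch (tree path)
  "Follow PATH, a list of zero-based indexes, down into the nested list TREE."
  (reduce #'(lambda (node ix) (nth ix node)) path :initial-value tree))

(defun sketch-find-path (tree target)
  "Return the index path to the first string in TREE equal to TARGET, or NIL."
  (when (consp tree)
    (loop for child in tree
          for ix from 0
          do (cond ((and (stringp child) (string= child target))
                    (return (list ix)))
                   ((consp child)
                    (let ((sub (sketch-find-path child target)))
                      (when sub (return (cons ix sub)))))))))

;; A tiny made-up DOM in the same (:MATCH tag ...) shape as the rows above.
(defparameter *sketch-dom*
  '(:MATCH "table"
    (:MATCH "tr" (:MATCH "td" (:TEXT "Light Rail Service")))
    (:MATCH "tr" (:MATCH "td" (:TEXT "Regular Bus Service")))))

(sketch-find-path *sketch-dom* "Regular Bus Service")
;; => (3 2 2 1)
(sketch-path-fetch *sketch-dom* '(3 2))
;; => (:MATCH "td" (:TEXT "Regular Bus Service"))
```

;Dropping the last index or two from a found path gives the enclosing
; element, which is roughly how I browsed outward from the text to the table.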
;For example, here's the row at index 8, which is a title row: (setq tit1 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 8))) (:MATCH "tr" (:MATCH "td" (:OPEN "a" (:ATTVAL "class" "no_arrow") (:ATTVAL "name" "100")) (:MATCH "span" (:ATTVAL "class" "blue-title-secondary-red") (:TEXT "Light Rail Service")))) ;And here at index 16 is another title row: (setq tit2 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 16))) (:MATCH "tr" (:MATCH "td" (:OPEN "a" (:ATTVAL "class" "no_arrow") (:ATTVAL "name" "101")) (:MATCH "span" (:ATTVAL "class" "blue-title-secondary-red") (:TEXT "Community Bus Service")))) ;Comparing the two, we see mostly boilerplate, with a few differences: (constrees-compare-build-nestal tit1 tit2) (:DIFFS (2 "Light Rail Service" "Community Bus Service") (1 "100" "101")) ;Using that report, it's easy to edit either of the pieces to turn it ; into a template for recognizing this class of rows and gleaning the ; variable parts from it: (defparameter g*vta-template-title '(:MATCH "tr" (:MATCH "td" (:OPEN "a" (:ATTVAL "class" "no_arrow") (:ATTVAL "name" :LABEL)) (:MATCH "span" (:ATTVAL "class" "blue-title-secondary-red") (:TEXT :TITLE-STRING))))) ;Now we actually use that as a template: (constree+template-compare-build-nestal tit2 g*vta-template-title) (:VARPARTS (:TITLE-STRING . "Community Bus Service") (:LABEL . "101")) ;Now we build that template into a function to recognize such rows: (defun vta-recognize-title1 (row) (constree+template-compare-build-nestal row g*vta-template-title (make-nestal :TITLE1))) ;Now we test that recognizer by applying it to each of the rows in the table: (mapcar #'vta-recognize-title1 g*vta-top-rows) ((:TITLE1 (:TITLE-STRING . "Light Rail Service") (:LABEL . "100")) NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Community Bus Service") (:LABEL . "101")) NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . 
"Regular Bus Service") (:LABEL . "102")) NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Express Bus Service") (:LABEL . "103")) NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Limited Stop Bus Service") (:LABEL . "104")) NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Bus Rapid Transit Service") (:LABEL . "105")) NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Light Rail Shuttle") (:LABEL . "106")) NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Altamont Commuter Express (ACE) Shuttles") (:LABEL . "107")) NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL (:TITLE1 (:TITLE-STRING . "Regional Services") (:LABEL . "108")) NIL NIL NIL NIL NIL NIL) ;Note the NIL items show rows in the table that don't match that template, ; i.e. the other types of rows: spacers, gotos, and bus/LR route numbers. ;Now we do the same sort of thing for the rows that are bus/LR routes. ;One complication is that there are two different formats of LR rows, ; so we need two different templates. 
Fortunately each of these two ; templates matches some of the regular bus routes too: (setq lr1 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 9))) (:MATCH "tr" (:MATCH "td" (:ATTVAL "class" "ZSMFlistDk") (:MATCH "a" (:ATTVAL "href" "/schedules/SC_901.html") (:TEXT "   ") (:MATCH "strong" (:TEXT "901")) (:TEXT "   Alum Rock to Santa Teresa")))) (setq lr3 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 11))) (:MATCH "tr" (:MATCH "td" (:ATTVAL "class" "ZSMFlistDk") (:MATCH "a" (:ATTVAL "href" "/schedules/SC_900.html") (:TEXT "   ") (:MATCH "strong" (:TEXT "900")) (:TEXT "   Ohlone/Chynoweth to Almaden")))) (setq bus22 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 41))) (:MATCH "tr" (:MATCH "td" (:ATTVAL "class" "ZSMFlistDk") (:MATCH "a" (:ATTVAL "href" "/schedules/SC_22.html") (:TEXT "   ") (:MATCH "strong" (:TEXT "22")) (:TEXT "   Eastridge Transit Center to Palo Alto Transit Center via El Camino")))) ;Comparing any two of those to see which parts are variable: (constrees-compare-build-nestal lr1 lr3) (:DIFFS (3 "   Alum Rock to Santa Teresa" "   Ohlone/Chynoweth to Almaden") (2 "901" "900") (1 "/schedules/SC_901.html" "/schedules/SC_900.html")) ;Editing one of the rows to create a template: (defparameter g*vta-template-lrbus1 '(:MATCH "tr" (:MATCH "td" (:ATTVAL "class" "ZSMFlistDk") (:MATCH "a" (:ATTVAL "href" :ROUTE-URL) (:TEXT "   ") (:MATCH "strong" (:TEXT :ROUTE-NUM)) (:TEXT :ROUTE-STRING))))) ;Using that template to recognize one of the rows: (constree+template-compare-build-nestal bus22 g*vta-template-lrbus1) (:VARPARTS (:ROUTE-STRING . "   Eastridge Transit Center to Palo Alto Transit Center via El Camino") (:ROUTE-NUM . "22") (:ROUTE-URL . 
"/schedules/SC_22.html")) ;Now incorporate that template into a function to recognize this class of row: (defun vta-recognize-lrbus1 (row) (constree+template-compare-build-nestal row g*vta-template-lrbus1 (make-nestal :LRBUS1))) ;Now test that recognizer by applying it to *each* the row in the table: (mapcar #'vta-recognize-lrbus1 g*vta-top-rows) (NIL (:LRBUS1 (:ROUTE-STRING . "   Alum Rock to Santa Teresa") (:ROUTE-NUM . "901") (:ROUTE-URL . "/schedules/SC_901.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Ohlone/Chynoweth to Almaden") (:ROUTE-NUM . "900") (:ROUTE-URL . "/schedules/SC_900.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   San Jose Market Center to Hedding & 15th") (:ROUTE-NUM . "11") (:ROUTE-URL . "/schedules/SC_11.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Gilroy Transit Center to St Louise Hospital") (:ROUTE-NUM . "14") (:ROUTE-URL . "/schedules/SC_14.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Gilroy Transit Center to Murray and Tomkins") (:ROUTE-NUM . "17") (:ROUTE-URL . "/schedules/SC_17.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Gilroy Transit Center via Wren & Mantelli") (:ROUTE-NUM . "19") (:ROUTE-URL . "/schedules/SC_19.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   San Antonio Shopping Center to Downtown Mt. View Transit Center") (:ROUTE-NUM . "34") (:ROUTE-URL . "/schedules/SC_34.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   The Villages to Eastridge Transit Center") (:ROUTE-NUM . "39") (:ROUTE-URL . "/schedules/SC_39.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Alum Rock Transit Center to Penitencia Creek Transit Center") (:ROUTE-NUM . "45") (:ROUTE-URL . "/schedules/SC_45.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Winchester Transit Center to Downtown Los Gatos via Los Gatos Blvd") (:ROUTE-NUM . "49") (:ROUTE-URL . "/schedules/SC_49.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Palo Alto Veteran's Hospital to Middlefield and Colorado") (:ROUTE-NUM . "88") (:ROUTE-URL . "/schedules/SC_88.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . 
"   Airport Flyer: Santa Clara Transit Center to Metro-Airport LRT Station") (:ROUTE-NUM . "10") (:ROUTE-URL . "/schedules/SC_10.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Palo Alto Transit Center via El Camino") (:ROUTE-NUM . "22") (:ROUTE-URL . "/schedules/SC_22.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Alum Rock Transit Center to De Anza College via Valley Medical Center") (:ROUTE-NUM . "25") (:ROUTE-URL . "/schedules/SC_25.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Kaiser San Jose to Good Samaritan Hospital") (:ROUTE-NUM . "27") (:ROUTE-URL . "/schedules/SC_27.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Stanford Shopping Center to Downtown Mt. View Transit Center") (:ROUTE-NUM . "35") (:ROUTE-URL . "/schedules/SC_35.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Monterey & Senter to Santa Teresa LRT Station") (:ROUTE-NUM . "42") (:ROUTE-URL . "/schedules/SC_42.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Great Mall/Main Transit Center to McCarthy Ranch") (:ROUTE-NUM . "47") (:ROUTE-URL . "/schedules/SC_47.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Downtown Mountain View Transit Center to Foothill College") (:ROUTE-NUM . "52") (:ROUTE-URL . "/schedules/SC_52.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   De Anza College to Sunnyvale/Lockheed Martin Transit Center") (:ROUTE-NUM . "54") (:ROUTE-URL . "/schedules/SC_54.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   West Valley College to Great America via Quito Rd") (:ROUTE-NUM . "57") (:ROUTE-URL . "/schedules/SC_57.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Winchester Transit Center to Great America") (:ROUTE-NUM . "60") (:ROUTE-URL . "/schedules/SC_60.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Good Samaritan Hospital to Sierra & Piedmont") (:ROUTE-NUM . "62") (:ROUTE-URL . "/schedules/SC_62.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Almaden LRT Station via Downtown San Jose to McKee & White") (:ROUTE-NUM . "64") (:ROUTE-URL . "/schedules/SC_64.html")) NIL (:LRBUS1 (:ROUTE-STRING . 
"   Gilroy Transit Center to San Jose Diridon Transit Center") (:ROUTE-NUM . "68") (:ROUTE-URL . "/schedules/SC_68.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Great Mall/Main Transit Center") (:ROUTE-NUM . "71") (:ROUTE-URL . "/schedules/SC_71.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Snell & Capitol to Downtown San Jose") (:ROUTE-NUM . "73") (:ROUTE-URL . "/schedules/SC_73.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Great Mall/ Main Transit Center") (:ROUTE-NUM . "77") (:ROUTE-URL . "/schedules/SC_77.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Westgate to Downtown San Jose") (:ROUTE-NUM . "82") (:ROUTE-URL . "/schedules/SC_82.html")) NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   Camden & Highway 85 to Palo Alto") (:ROUTE-NUM . "101") (:ROUTE-URL . "/schedules/SC_101.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Palo Alto") (:ROUTE-NUM . "103") (:ROUTE-URL . "/schedules/SC_103.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Fremont BART to Lockheed Martin Transit Center/Moffett Industrial Park") (:ROUTE-NUM . "120") (:ROUTE-URL . "/schedules/SC_120.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   South San Jose to Lockheed Martin Transit Center/Moffett Industrial Park") (:ROUTE-NUM . "122") (:ROUTE-URL . "/schedules/SC_122.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Gilroy Transit Center to San Jose Diridon Transit Center") (:ROUTE-NUM . "168") (:ROUTE-URL . "/schedules/SC_168.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Fremont BART to San Jose Diridon Transit Center") (:ROUTE-NUM . "181") (:ROUTE-URL . "/schedules/SC_181.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   South San Jose to Sunnyvale Transit Center via Arques") (:ROUTE-NUM . "304") (:ROUTE-URL . "/schedules/SC_304.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Almaden Expwy. & Camden to Lockheed Martin Transit Center/Moffett Industrial Park") (:ROUTE-NUM . "328") (:ROUTE-URL . 
"/schedules/SC_328.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Palo Alto Transit Center") (:ROUTE-NUM . "522") (:ROUTE-URL . "/schedules/SC_522.html")) NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   Downtown Area Shuttle (DASH)") (:ROUTE-NUM . "201") (:ROUTE-URL . "/schedules/SC_201.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Hitachi Shuttle") (:ROUTE-NUM . "805") (:ROUTE-URL . "/schedules/SC_805.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   Gray Line - South Sunnyvale") (:ROUTE-NUM . "822") (:ROUTE-URL . "/schedules/SC_822.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Orange Line - Mountain View/Palo Alto") (:ROUTE-NUM . "824") (:ROUTE-URL . "/schedules/SC_824.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Red Line - North Sunnyvale") (:ROUTE-NUM . "826") (:ROUTE-URL . "/schedules/SC_826.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Brown Line - North San Jose") (:ROUTE-NUM . "828") (:ROUTE-URL . "/schedules/SC_828.html")) NIL NIL NIL NIL NIL (:LRBUS1 (:ROUTE-STRING . "   Highway 17 Express") (:ROUTE-NUM . "970") (:ROUTE-URL . "/schedules/SC_970.html")) NIL (:LRBUS1 (:ROUTE-STRING . "   Line 55 Monterey - San Jose Express") (:ROUTE-NUM . "972") (:ROUTE-URL . 
"/schedules/SC_972.html")) NIL NIL NIL) ;Next, that other format of row for light-rail routes: (setq lr2 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 10))) (:MATCH "tr" (:MATCH "td" (:MATCH "a" (:ATTVAL "href" "/schedules/SC_902.html") (:TEXT "   ") (:MATCH "strong" (:TEXT "902")) (:TEXT "   Mountain View to Winchester")))) (setq lr4 (nestal+pathixs-fetch (nestal+key-find na1 :DOM) '(3 3 10 6 4 6 46 3 12))) (:MATCH "tr" (:MATCH "td" (:MATCH "a" (:ATTVAL "href" "/services/trolleys/historic_trolleys.html") (:TEXT "   ") (:MATCH "strong" (:TEXT "920")) (:TEXT "   Historic Trolley")))) (constrees-compare-build-nestal lr2 lr4) (:DIFFS (3 "   Mountain View to Winchester" "   Historic Trolley") (2 "902" "920") (1 "/schedules/SC_902.html" "/services/trolleys/historic_trolleys.html")) (defparameter g*vta-template-lrbus2 '(:MATCH "tr" (:MATCH "td" (:MATCH "a" (:ATTVAL "href" :ROUTE-URL) (:TEXT "   ") (:MATCH "strong" (:TEXT :ROUTE-NUM)) (:TEXT :ROUTE-STRING))))) (defun vta-recognize-lrbus2 (row) (constree+template-compare-build-nestal row g*vta-template-lrbus2 (make-nestal :LRBUS2))) (mapcar #'vta-recognize-lrbus2 g*vta-top-rows) (NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Mountain View to Winchester") (:ROUTE-NUM . "902") (:ROUTE-URL . "/schedules/SC_902.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Historic Trolley") (:ROUTE-NUM . "920") (:ROUTE-URL . "/services/trolleys/historic_trolleys.html")) NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Almaden & McKean to Ohlone/Chynoweth LRT Station") (:ROUTE-NUM . "13") (:ROUTE-URL . "/schedules/SC_13.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Burnett Ave. to Morgan Hill Civic Center") (:ROUTE-NUM . "16") (:ROUTE-URL . "/schedules/SC_16.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Gilroy Transit Center via Gavilan College") (:ROUTE-NUM . "18") (:ROUTE-URL . "/schedules/SC_18.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Santa Clara Transit Center to San Antonio Shopping Center") (:ROUTE-NUM . "32") (:ROUTE-URL . 
"/schedules/SC_32.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Capitol Light Rail Station to West Valley College") (:ROUTE-NUM . "37") (:ROUTE-URL . "/schedules/SC_37.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Monterey & Senter to Santa Teresa LRT Station") (:ROUTE-NUM . "42") (:ROUTE-URL . "/schedules/SC_42.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Winchester Transit Center to Downtown Los Gatos via Winchester") (:ROUTE-NUM . "48") (:ROUTE-URL . "/schedules/SC_48.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Kooser & Meridian to Downtown San Jose") (:ROUTE-NUM . "65") (:ROUTE-URL . "/schedules/SC_65.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Palo Alto Veteran's Hospital to California Ave. Caltrain Station") (:ROUTE-NUM . "89") (:ROUTE-URL . "/schedules/SC_89.html")) NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Eastridge Transit Center to San Jose Civic Center via San Jose Flea Market") (:ROUTE-NUM . "12") (:ROUTE-URL . "/schedules/SC_12.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Alum Rock Transit Center to De Anza College via Stevens Creek") (:ROUTE-NUM . "23") (:ROUTE-URL . "/schedules/SC_23.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Eastridge Transit Center to Sunnyvale-Lockheed Martin Transit Center") (:ROUTE-NUM . "26") (:ROUTE-URL . "/schedules/SC_26.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Eastridge Transit Center to Evergreen Valley College") (:ROUTE-NUM . "31") (:ROUTE-URL . "/schedules/SC_31.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   La Avenida & Shoreline to Foothill College") (:ROUTE-NUM . "40") (:ROUTE-URL . "/schedules/SC_40.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Great Mall/Main Transit Center to Washington & Escuela via Yellowstone") (:ROUTE-NUM . "46") (:ROUTE-URL . "/schedules/SC_46.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   De Anza College to Moffett Field/Ames Center") (:ROUTE-NUM . "51") (:ROUTE-URL . "/schedules/SC_51.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   West Valley College to Sunnyvale Transit Center") (:ROUTE-NUM . "53") (:ROUTE-URL . 
"/schedules/SC_53.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   De Anza College to Great America") (:ROUTE-NUM . "55") (:ROUTE-URL . "/schedules/SC_55.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   West Valley College to Alviso via Fruitvale") (:ROUTE-NUM . "58") (:ROUTE-URL . "/schedules/SC_58.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Good Samaritan Hospital to Sierra & Piedmont") (:ROUTE-NUM . "61") (:ROUTE-URL . "/schedules/SC_61.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Almaden Expwy. & Camden to San Jose State University") (:ROUTE-NUM . "63") (:ROUTE-URL . "/schedules/SC_63.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Kaiser San Jose to Milpitas/Dixon Rd") (:ROUTE-NUM . "66") (:ROUTE-URL . "/schedules/SC_66.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Capitol LRT Station to Great Mall/Main Transit Center") (:ROUTE-NUM . "70") (:ROUTE-URL . "/schedules/SC_70.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Senter & Monterey to Downtown San Jose") (:ROUTE-NUM . "72") (:ROUTE-URL . "/schedules/SC_72.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Los Gatos to Summit Road") (:ROUTE-NUM . "76") (:ROUTE-URL . "/schedules/SC_76.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   San Jose State University to Cupertino Square") (:ROUTE-NUM . "81") (:ROUTE-URL . "/schedules/SC_81.html")) NIL NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   South San Jose to Palo Alto") (:ROUTE-NUM . "102") (:ROUTE-URL . "/schedules/SC_102.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Penitencia Creek Transit Center to Palo Alto") (:ROUTE-NUM . "104") (:ROUTE-URL . "/schedules/SC_104.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Gilroy Transit Center to Lockheed Martin Transit Center/Moffett Industrial Park") (:ROUTE-NUM . "121") (:ROUTE-URL . "/schedules/SC_121.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Fremont BART to Mission College & Montague Expwy.") (:ROUTE-NUM . "140") (:ROUTE-URL . "/schedules/SC_140.html")) NIL (:LRBUS2 (:ROUTE-STRING . 
"   Fremont BART to Great Mall/Main Transit Center/San Jose Diridon Transit Center") (:ROUTE-NUM . "180") (:ROUTE-URL . "/schedules/SC_180.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Palo Alto to IBM/Bailey Ave") (:ROUTE-NUM . "182") (:ROUTE-URL . "/schedules/SC_182.html")) NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Great Mall/Main Transit Center to Lockheed Martin/Moffett Park") (:ROUTE-NUM . "321") (:ROUTE-URL . "/schedules/SC_321.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Almaden Expwy. & Camden to Tasman Drive") (:ROUTE-NUM . "330") (:ROUTE-URL . "/schedules/SC_330.html")) NIL NIL NIL NIL NIL NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   River Oaks Shuttle") (:ROUTE-NUM . "203") (:ROUTE-URL . "/schedules/SC_203.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   IBM Shuttle") (:ROUTE-NUM . "806") (:ROUTE-URL . "/schedules/SC_806.html")) NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Green Line - North Santa Clara") (:ROUTE-NUM . "823") (:ROUTE-URL . "/schedules/SC_823.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Purple Line - West Milpitas") (:ROUTE-NUM . "825") (:ROUTE-URL . "/schedules/SC_825.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Yellow Line - South Santa Clara") (:ROUTE-NUM . "827") (:ROUTE-URL . "/schedules/SC_827.html")) NIL (:LRBUS2 (:ROUTE-STRING . "   Violet Line - East Milpitas") (:ROUTE-NUM . "831") (:ROUTE-URL . "/schedules/SC_831.html")) NIL NIL NIL NIL NIL (:LRBUS2 (:ROUTE-STRING . "   Dumbarton Express") (:ROUTE-NUM . "971") (:ROUTE-URL . "/schedules/SC_971.html")) NIL NIL NIL NIL) ;I can mostly ignore the spacer rows and the GOTO rows. ;In fact, to alow the check-for-completeness below, I did go ahead to ; define recognizers for spacer and GOTO rows, which are used the next day. ;That completes the main "difficult part" of gleaning information from ; that main table, to obtain the route numbers, text descriptions, ; and URLs, for each light-rail and bus route. 
Update 2010.May.09 details:
;Finished writing recognizers for each type of row, per corresponding template,
; then wrote:
(defun vta-recognize-alltop (row)
  (or (vta-recognize-title1 row)
      (vta-recognize-lrbus1 row)
      (vta-recognize-lrbus2 row)
      (vta-recognize-space1 row)
      (vta-recognize-goto1 row)))
;Then tested that against the complete list of rows in that table:
(length (setq g-ress (mapcar #'vta-recognize-alltop g*vta-top-rows)))
;=> 125
(mapcar #'car g-ress)
(:TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2
 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1
 :LRBUS2 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2
 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1
 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2
 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1
 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2
 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2 :LRBUS1 :LRBUS2
 :SPACE1 :GOTO1 :SPACE1
 :TITLE1 :LRBUS1 :LRBUS2 :LRBUS1 :SPACE1 :GOTO1 :SPACE1)
;A quick glance doesn't show any NIL entries, but to be safe:
(position NIL *)
;=> NIL
;Confirmed: Each row of the table matched one of the recognizers.
;Next I need to merge the above two kinds of route-number rows, and the
; title rows that precede each group of route-number rows, then break
; the groups of route-number rows that follow each title row. What I'll
; do is start from the g-ress computed just above. Here's a sample of
; what some of it really looks like:
(subseq g-ress 0 6)
((:TITLE1 (:TITLE-STRING .
"Light Rail Service") (:LABEL . "100")) (:LRBUS1 (:ROUTE-STRING . "   Alum Rock to Santa Teresa") (:ROUTE-NUM . "901") (:ROUTE-URL . "/schedules/SC_901.html")) (:LRBUS2 (:ROUTE-STRING . "   Mountain View to Winchester") (:ROUTE-NUM . "902") (:ROUTE-URL . "/schedules/SC_902.html")) (:LRBUS1 (:ROUTE-STRING . "   Ohlone/Chynoweth to Almaden") (:ROUTE-NUM . "900") (:ROUTE-URL . "/schedules/SC_900.html")) (:LRBUS2 (:ROUTE-STRING . "   Historic Trolley") (:ROUTE-NUM . "920") (:ROUTE-URL . "/services/trolleys/historic_trolleys.html")) (:SPACE1)) ;Another sample further along in the table: (subseq g-ress 26 34) ((:LRBUS2 (:ROUTE-STRING . "   Palo Alto Veteran's Hospital to California Ave. Caltrain Station") (:ROUTE-NUM . "89") (:ROUTE-URL . "/schedules/SC_89.html")) (:SPACE1) (:GOTO1) (:SPACE1) (:TITLE1 (:TITLE-STRING . "Regular Bus Service") (:LABEL . "102")) (:LRBUS1 (:ROUTE-STRING . "   Airport Flyer: Santa Clara Transit Center to Metro-Airport LRT Station") (:ROUTE-NUM . "10") (:ROUTE-URL . "/schedules/SC_10.html")) (:LRBUS2 (:ROUTE-STRING . "   Eastridge Transit Center to San Jose Civic Center via San Jose Flea Market") (:ROUTE-NUM . "12") (:ROUTE-URL . "/schedules/SC_12.html")) (:LRBUS1 (:ROUTE-STRING . "   Eastridge Transit Center to Palo Alto Transit Center via El Camino") (:ROUTE-NUM . "22") (:ROUTE-URL . "/schedules/SC_22.html"))) ;Starting work on what I said above (splitting segments per where title lines ; appear, then further combining, ...) ;Wait until later when that is done: ; Then I'll have the information I need to generate the first two menus, ; namely the classes/groups of routes, and then within each group a list of ; the actual routes, for the user to select the desired route number.