2017-10-18

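I have a C# script task in an SSIS job that calls an API to do geocoding. The API is proprietary and works like this: it receives a request, takes the address string, and tries to match it against a huge list of addresses (millions). If it cannot find a match, it goes out to another service, such as Google, to get the geodata. How can I speed up this C# HttpWebRequest?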

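As you can imagine, this string matching takes a lot of time per request. Sometimes the per-minute request rate is very slow, and I have 4M addresses to process. Doing any development work on the API side is not an option. To give a better picture of the process, here is what I am doing currently: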

I pull the list of addresses (about 4M) from the database, put it in a DataTable, and set the variables:

​​

GetGLFromAddress() works like this:

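Take the variables from above and form the JSON. Send the JSON using "POST" and HttpWebRequest. Wait for the request (time-consuming). Get the response back. Set new variables from the response. Use those variables to update/insert into the database, then loop to the next row in the original DataTable.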

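Understanding this flow is important, because I need to keep the variables for each request intact so that I can update the correct record in the database.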

Here is GetGLFromAddress():

private void GetGLFromAddress() 
    { 
     // Request JSON data with Payload 
     var httpWebRequest = (HttpWebRequest)WebRequest.Create("http:"); 
     httpWebRequest.Headers.Add("Authorization", ""); 
     httpWebRequest.ContentType = "application/json"; 
     httpWebRequest.Method = "POST"; 

     using (var streamWriter = new StreamWriter(httpWebRequest.GetRequestStream())) 
     { 
      // this takes the variables from your c# datatable and formats them for json post 
      var jS = new JavaScriptSerializer(); 
      var newJson = jS.Serialize(new SeriesPost() 
      { 
       AddressLine1 = address, 
       City   = city, 
       StateCode = state, 
       CountryCode = country, 
       PostalCode = zip, 
       CreateSiteIfNotFound = true 
      }); 


      // Write the JSON to the debug output so it can be inspected 
      System.Diagnostics.Debug.WriteLine(newJson); 
      streamWriter.Write(newJson); 
      streamWriter.Flush(); 
      streamWriter.Close(); 

     } 

     try 
     { 
      var httpResponse = (HttpWebResponse)httpWebRequest.GetResponse(); 
      using (var streamReader = new StreamReader(httpResponse.GetResponseStream())) 
      { 
       var result = streamReader.ReadToEnd(); 
       // javascript serializer... deserializing the returned json so that way you can set the variables used for insert string 
       var p1 = new JavaScriptSerializer(); 

       // After this line, obj holds the deserialized JSON. Note how obj[x].FieldName is referenced below; 
       // to change the fields or bring in more of them, add them to RootObject and read them the same way. 
       var obj = p1.Deserialize<List<RootObject>>(result); 

       // Check that each returned value is not null or blank before using it; when one is missing, the variable is set to the string "null". 
       if (string.IsNullOrWhiteSpace(obj[0].MasterSiteId)) 
       { 
        retGLMID = "null"; 
       } 
       else 
       { 
        retGLMID = obj[0].MasterSiteId.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].PrecisionName)) 
       { 
        retAcc = "null"; 
       } 
       else 
       { 
        retAcc = obj[0].PrecisionName.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].PrimaryAddress.AddressLine1Combined)) 
       { 
        retAddress = "null"; 
       } 
       else 
       { 
        retAddress = obj[0].PrimaryAddress.AddressLine1Combined.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].Latitude)) 
       { 
        retLat = "null"; 
       } 
       else 
       { 
        retLat = obj[0].Latitude.ToString(); 
       } 

       if (string.IsNullOrWhiteSpace(obj[0].Longitude)) 
       { 
        retLong = "null"; 
       } 
       else 
       { 
        retLong = obj[0].Longitude.ToString(); 
       } 
       retNewRecord = obj[0].IsNewRecord.ToString(); 

       // Build insert string... notice how I use the recently created variables 
       // string insertStr = retGLMID + ", '" + retAcc + "', '" + retAddress + "', '" + retLat + "', '" + retLong + "', '" + localID; 
       // Parameterized insert: values are passed as parameters rather than concatenated into the SQL text 
       string insertStr = "insert into table " + 
            "(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " + 
            "VALUES (@id, @glmId, @newRecord, @accuracy)"; 

       string connectionString = "Data Source=; Initial Catalog=; Trusted_Connection=Yes"; 
       using (SqlConnection connection = new SqlConnection(connectionString)) 
       using (SqlCommand cmd = new SqlCommand(insertStr, connection)) 
       { 
        cmd.Parameters.AddWithValue("@id", localID); 
        cmd.Parameters.AddWithValue("@glmId", retGLMID); 
        cmd.Parameters.AddWithValue("@newRecord", retNewRecord); 
        cmd.Parameters.AddWithValue("@accuracy", retAcc); 
        connection.Open(); 
        cmd.ExecuteNonQuery(); // the using blocks close the connection 
       } 
      } 
     } 

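     catch (Exception) // any failure (HTTP error, empty or malformed response) records the row as Not_Found 
     { 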
      // Fallback insert; only the ID varies, so it is passed as a parameter 
      string insertStr2 = "insert into table " + 
           "(ID,GLM_ID,NEW_RECORD_IND,ACCURACY) " + 
           "VALUES (@id, null, null, 'Not_Found')"; 
      string connectionString2 = "Data Source=; Initial Catalog=; Trusted_Connection=Yes"; 

      using (SqlConnection connection = new SqlConnection(connectionString2)) 
      using (SqlCommand cmd = new SqlCommand(insertStr2, connection)) 
      { 
       cmd.Parameters.AddWithValue("@id", localID); 
       connection.Open(); 
       cmd.ExecuteNonQuery(); // the using blocks close the connection 
      } 
     } 
    } 

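When I tried using Parallel.ForEach, I had problems with the variables. I want to run multiple requests at once but keep a separate instance of each variable for each request (if that makes sense). I can't pass an ID to the API and have it returned, otherwise that would be ideal.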
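The variable problem with Parallel.ForEach is probably caused by values such as address, localID and retGLMID living in fields that every thread shares. A minimal sketch of one way around it, assuming the per-row values can be passed in as an object and the results returned as another object: all type and member names here (GeoRequest, GeoResponse, ParallelGeocoder, callApi) are hypothetical, and callApi stands in for the HttpWebRequest logic shown above.

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical container for one row's input values.
public sealed class GeoRequest
{
    public string LocalId;
    public string Address;
}

// Hypothetical container for one row's output; it carries LocalId back with it.
public sealed class GeoResponse
{
    public string LocalId;
    public string GlmId;
    public string Accuracy;
}

public static class ParallelGeocoder
{
    // callApi is assumed to perform one blocking API call for one request.
    public static List<GeoResponse> Run(
        IEnumerable<GeoRequest> requests,
        Func<GeoRequest, GeoResponse> callApi,
        int maxParallel)
    {
        var results = new ConcurrentBag<GeoResponse>();
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallel };

        Parallel.ForEach(requests, options, request =>
        {
            // Everything declared inside this lambda is local to one iteration,
            // so no other thread can overwrite it.
            GeoResponse response;
            try
            {
                response = callApi(request);
            }
            catch (Exception)
            {
                // Mirrors the Not_Found fallback of the original method.
                response = new GeoResponse { LocalId = request.LocalId, Accuracy = "Not_Found" };
            }
            results.Add(response);
        });

        return new List<GeoResponse>(results);
    }
}
```

Because each response carries its own LocalId, the database write can happen inside the loop or afterwards in one pass. On .NET Framework, HttpWebRequest is also limited by ServicePointManager.DefaultConnectionLimit, which defaults to 2 connections per host in client applications, so raising it may be necessary before parallel calls show any benefit.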

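Is this even possible?

How would I structure this call to achieve what I'm after?

Essentially, I want to be able to send multiple calls at once to speed up the whole process.

Edit: added the code for GetGLFromAddress(). And yes, I am a newbie, so please be kind :)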

Answers


Put all the data in an array so that you can issue multiple requests at once, preferably using multiple tasks or async methods to call the API.
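A rough sketch of what that suggestion could look like with async tasks, capped at a fixed number of in-flight requests so the API is not flooded. The names AddressRow, GeoResult and GeocodeBatch are hypothetical, and the geocode delegate is a placeholder for whatever performs one asynchronous API call; nothing here is taken from the real API.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical input row.
public sealed class AddressRow
{
    public string LocalId;
    public string Address;
}

// Hypothetical output row.
public sealed class GeoResult
{
    public string LocalId;
    public string GlmId;
}

public static class GeocodeBatch
{
    // geocode is assumed to perform one asynchronous API call and return the matched ID.
    public static async Task<List<GeoResult>> RunAsync(
        IEnumerable<AddressRow> rows,
        Func<AddressRow, Task<string>> geocode,
        int maxConcurrency)
    {
        using (var gate = new SemaphoreSlim(maxConcurrency))
        {
            // ToList() starts every task; the semaphore limits how many
            // are actually calling the API at the same time.
            var tasks = rows.Select(async row =>
            {
                await gate.WaitAsync();
                try
                {
                    string glmId = await geocode(row);
                    return new GeoResult { LocalId = row.LocalId, GlmId = glmId };
                }
                finally
                {
                    gate.Release();
                }
            }).ToList();

            GeoResult[] results = await Task.WhenAll(tasks);
            return results.ToList();
        }
    }
}
```

With millions of rows it would likely be better to feed this method chunks (for example a few thousand rows at a time) than to create one task per row up front. The right value for maxConcurrency depends on what the API can tolerate and has to be found by experiment.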


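So if I put all the request strings in an array, say just two for example, A and B, and I send A and B through and B returns first, how do I make sure I update the correct record in the database? Does async have a way to ensure things come back in the correct order? What about multithreading? – user3486773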
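On the ordering question: the responses do not need to come back in order, as long as each call keeps its own record ID next to its own response. A small simulation of the A and B scenario, assuming nothing about the real API: FakeGeocodeAsync and OutOfOrderDemo are made-up names, and the delays only imitate a slow and a fast request.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class OutOfOrderDemo
{
    // Simulated API call; the delay stands in for the slow address matching.
    static async Task<string> FakeGeocodeAsync(string address, int delayMs)
    {
        await Task.Delay(delayMs);
        return "GLM-" + address;
    }

    // localId is a parameter of this call, so it stays paired with its own
    // response no matter which request finishes first.
    static async Task ProcessAsync(
        string localId, string address, int delayMs,
        ConcurrentDictionary<string, string> updates)
    {
        string glmId = await FakeGeocodeAsync(address, delayMs);
        updates[localId] = glmId; // stands in for the database insert keyed by localId
    }

    public static async Task<ConcurrentDictionary<string, string>> RunAsync()
    {
        var updates = new ConcurrentDictionary<string, string>();
        Task a = ProcessAsync("A", "addr-A", 200, updates); // slow request
        Task b = ProcessAsync("B", "addr-B", 10, updates);  // fast request, likely to finish first
        await Task.WhenAll(a, b);
        return updates;
    }
}
```

Even though B will normally complete before A here, each entry ends up under its own key. The same holds for threads: the pairing comes from keeping the ID in a local variable or parameter of each call, not from the order in which responses arrive.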