If you use arrays in PHP, one of the most common tasks you’ll find yourself doing is determining if Item A is in Array X. The function you would probably use in this case is PHP’s in_array.
| bool in_array ( mixed $needle , array $haystack [, bool $strict = FALSE ] ) |
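The optional third parameter is easy to overlook, so here is a minimal sketch of what it changes. The sample values are made up for illustration and are not part of the experiment below.

```php
<?php
// Illustrative haystack mixing integers and strings (not from the experiment).
$haystack = [10, "apple", "20"];

// Loose comparison (the default): the numeric string "10" equals the integer 10.
$loose = in_array("10", $haystack);

// Strict comparison: types must match as well, so the string "10"
// no longer matches the integer 10.
$strict = in_array("10", $haystack, true);

var_dump($loose);  // bool(true)
var_dump($strict); // bool(false)
```

When the haystack only holds strings, as in the experiment below, both modes should give the same answer for ordinary non-numeric text.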
This function works well, and I recommend sticking with it when it makes sense. However, in_array() scans the haystack element by element, so when you’re dealing with a very large haystack and need to check thousands of values, the cost adds up quickly over thousands of calls. Having recently run into this situation, I set up a little experiment comparing in_array() with two alternative approaches.
The haystack in my experiment was an array containing 60,000 strings that were 50 characters in length as values.
| $haystack = array("String1", "String2", "String3", /* etc. */); |
The needle was a string of 50 characters.
Method A – Using in_array()
| if (in_array($needle, $haystack)) { |
|     echo("Method A : needle " . $needle . " found in haystack<BR>"); |
| } |
Method B – Using isset()
Basically, I reformatted the haystack so that the values of my original array became keys instead and the new value for each key was set to 1.
| $new_haystack = array(); |
| foreach ($haystack as $v) { |
|     $new_haystack[$v] = 1; |
| } |
So my haystack became:
| $new_haystack["String1"] = 1; |
| $new_haystack["String2"] = 1; |
| $new_haystack["String3"] = 1; |
| // etc. |
Then, all you need to do is look up the key:
| if (isset($new_haystack[$needle])) { |
|     echo("Method B : needle " . $needle . " found in haystack<BR>"); |
| } |
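As a side note, the same kind of lookup table can probably be built without a loop. The sketch below uses PHP's built-in array_flip(), which is my substitution and not what the experiment used, and a three-element stand-in for the real haystack.

```php
<?php
// Small stand-in haystack; the experiment used 60,000 strings.
$haystack = ["String1", "String2", "String3"];

// array_flip() swaps keys and values, so each string becomes a key
// whose value is its former index (0, 1, 2, ...).
$lookup = array_flip($haystack);

$needle = "String2";

// isset() is true for any existing key whose value is not null,
// so the index 0 stored under "String1" is not a problem.
$found   = isset($lookup[$needle]);
$missing = isset($lookup["String9"]);

var_dump($found);   // bool(true)
var_dump($missing); // bool(false)
```

Two caveats: array_flip() only accepts integer and string values, and duplicate values in the haystack collapse into a single key. For a pure membership test that is usually what you want anyway.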
Method C – Using array_intersect()
When all you really need to know is if needle is in haystack, using array_intersect() can also work.
| if (count(array_intersect(array($needle), $haystack)) > 0) { |
|     echo("Method C : needle " . $needle . " found in haystack<BR>"); |
| } |
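array_intersect() is arguably a more natural fit when you have several needles to check at once. Here is a small sketch of that usage. The $needles array and the sample values are assumptions for illustration, and this variant was not timed in the experiment.

```php
<?php
// Stand-in data; not the arrays from the experiment.
$haystack = ["String1", "String2", "String3"];
$needles  = ["String2", "String7", "String3"];

// array_intersect() returns the entries of $needles that also occur
// in $haystack, preserving the keys from $needles.
$present = array_intersect($needles, $haystack);

print_r($present); // keys 0 and 2 remain: "String2" and "String3"
```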
With these different methods in place, I executed them against the same $haystack and $needle and the results were clear:
| Method A : 0.003180980682373 seconds |
| Method B : 0.0000109672546 seconds |
| Method C : 0.045687913894653 seconds |
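Method B wins! Keep in mind that this only really becomes interesting with very large data sets. For those of you wondering how long it took to re-arrange the haystack for Method B to use, the answer is 0.025528907775879 seconds. That one-time cost is roughly eight times a single in_array() call, so Method B only pays off when you run many lookups against the same haystack.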
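In this experiment, determining whether 100,000 strings are in the data set went from 318.098 seconds with in_array() to 1.1222 seconds using isset(). Both figures match the per-lookup timings above multiplied by 100,000, with the isset() total also including the one-time cost of re-arranging the haystack. That’s pretty decent.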
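If you want to try this yourself, here is a rough sketch of how such a comparison could be set up. It is not the original test script: the random_string() helper, the use of array_fill_keys() to build the lookup table, and the choice of the last element as the needle are all my assumptions. Timings will vary by machine and PHP version.

```php
<?php
// Hypothetical helper: builds a random alphanumeric string of the given length.
function random_string(int $length): string
{
    $alphabet = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789';
    $max = strlen($alphabet) - 1;
    $out = '';
    for ($i = 0; $i < $length; $i++) {
        $out .= $alphabet[mt_rand(0, $max)];
    }
    return $out;
}

// Haystack sized like the one described above: 60,000 strings of 50 characters.
$haystack = [];
for ($i = 0; $i < 60000; $i++) {
    $haystack[] = random_string(50);
}

// Assumption: use the last element, the worst case for a linear scan.
$needle = $haystack[59999];

// Method A: linear scan with in_array().
$start  = microtime(true);
$foundA = in_array($needle, $haystack);
$timeA  = microtime(true) - $start;

// One-time cost of turning the values into keys for Method B.
$start     = microtime(true);
$lookup    = array_fill_keys($haystack, 1);
$timeBuild = microtime(true) - $start;

// Method B: key lookup with isset().
$start  = microtime(true);
$foundB = isset($lookup[$needle]);
$timeB  = microtime(true) - $start;

// Method C: array_intersect() with a one-element needle array.
$start  = microtime(true);
$foundC = count(array_intersect([$needle], $haystack)) > 0;
$timeC  = microtime(true) - $start;

printf("Method A : %.9f seconds\n", $timeA);
printf("Method B : %.9f seconds (plus %.9f seconds to build the lookup)\n", $timeB, $timeBuild);
printf("Method C : %.9f seconds\n", $timeC);
```

A single call is a noisy measurement, so for more trustworthy numbers you would likely want to repeat each lookup many times and average the results.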