Is there an easy way to delete an element from an array using PHP, such that foreach ($array) no longer includes that element? I thought that setting it to null would do it, but apparently it does not work.
The Good:
- Do include a small example DataFrame, either as runnable code:
In [1]: df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])
or make it “copy and pasteable” using pd.read_clipboard(sep=r'\s\s+'):
In [2]: df
Out[2]:
   A  B
0  1  2
1  1  3
2  4  6
Test it yourself to make sure it works and reproduces the issue.
- You can format the text for Stack Overflow by highlighting and using Ctrl+K (or prepend four spaces to each line), or place three backticks (```) above and below your code with your code unindented.
- I really do mean small. The vast majority of example DataFrames could be fewer than 6 rows,[citation needed] and I bet I can do it in 5. Can you reproduce the error with df = df.head()? If not, fiddle around to see if you can make up a small DataFrame which exhibits the issue you are facing.
But every rule has an exception, the obvious one being for performance issues (in which case definitely use %timeit and possibly %prun to profile your code), where you should generate:
df = pd.DataFrame(np.random.randn(100000000, 10))
Consider using np.random.seed so we have the exact same frame. Having said that, “make this code fast for me” is not strictly on topic for the site.
- For getting runnable code, df.to_dict is often useful, with the different orient options for different cases. In the example above, I could have grabbed the data and columns from df.to_dict('split').
- Write out the outcome you desire (similarly to above)
In [3]: iwantthis
Out[3]:
   A  B
0  1  5
1  4  6
Explain where the numbers come from:
The 5 is the sum of the B column for the rows where A is 1.
- Do show the code you’ve tried:
In [4]: df.groupby('A').sum()
Out[4]:
   B
A
1  5
4  6
But say what’s incorrect:
The A column is in the index rather than a column.
- Do show you’ve done some research (search the documentation, search Stack Overflow), and give a summary:
The docstring for sum simply states “Compute sum of group values”
The groupby documentation doesn’t give any examples for this.
Aside: the answer here is to use df.groupby('A', as_index=False).sum().
- If it’s relevant that you have Timestamp columns, e.g. you’re resampling or something, then be explicit and apply pd.to_datetime to them for good measure.
df['date'] = pd.to_datetime(df['date']) # this column ought to be date.
Sometimes this is the issue itself: they were strings.
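To make the points above concrete, here is a minimal sketch (not the only way to do it) that dumps the small example frame with df.to_dict('split'), rebuilds it from those literals, and then applies the as_index=False fix mentioned in the aside. The variable names payload, rebuilt, and result are just illustrative.

```python
import pandas as pd

# The small example frame from the list above.
df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])

# Dump the frame as plain Python literals that can be pasted into a question.
# The 'split' orientation yields the keys 'index', 'columns' and 'data'.
payload = df.to_dict('split')

# Anyone can rebuild the same frame from the pasted literals.
rebuilt = pd.DataFrame(payload['data'], columns=payload['columns'])

# as_index=False keeps the grouping key 'A' as an ordinary column
# instead of moving it into the index.
result = rebuilt.groupby('A', as_index=False).sum()
print(result)
```

Printing result should show A as a regular column next to the summed B, which is the shape asked for in iwantthis.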
The Bad:
- Don’t include a MultiIndex, which we can’t copy and paste (see above). This is kind of a grievance with Pandas’ default display, but nonetheless annoying:
In [11]: df
Out[11]:
     C
A B
1 2  3
  2  6
The correct way is to include an ordinary DataFrame with a set_index call:
In [12]: df = pd.DataFrame([[1, 2, 3], [1, 2, 6]], columns=['A', 'B', 'C'])
In [13]: df = df.set_index(['A', 'B'])
In [14]: df
Out[14]:
     C
A B
1 2  3
  2  6
- Do provide insight to what it is when giving the outcome you want:
   B
A
1  1
5  0
Be specific about how you got the numbers (what are they)… double check they’re correct.
- If your code throws an error, do include the entire stack trace. This can be edited out later if it’s too noisy. Show the line number and the corresponding line of your code which it’s raising against.
- Pandas 2.0 introduced a number of changes, and Pandas 1.0 before that, so if you’re getting unexpected output, include the version:
pd.__version__
On that note, you might also want to include the version of Python, your OS, and any other libraries. You could use pd.show_versions() or the session_info package (which shows loaded libraries and Jupyter/IPython environment).
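As a rough illustration of the MultiIndex and version advice above, the sketch below builds the frame from flat, copy-pasteable rows, applies set_index, and prints the pandas version alongside it. The names flat, indexed, and restored are made up for this example.

```python
import pandas as pd

# Start from an ordinary frame that is easy to copy and paste...
flat = pd.DataFrame([[1, 2, 3], [1, 2, 6]], columns=['A', 'B', 'C'])

# ...and only then build the MultiIndex with an explicit set_index call.
indexed = flat.set_index(['A', 'B'])

# reset_index() moves the index levels back into ordinary columns, which is
# handy when you receive a MultiIndex frame and want to share it.
restored = indexed.reset_index()

print(indexed)
print("pandas", pd.__version__)  # include this when output looks unexpected
```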
The Ugly:
- Don’t link to a CSV file we don’t have access to (and ideally don’t link to an external source at all).
df = pd.read_csv('my_secret_file.csv') # ideally with lots of parsing options
Most data is proprietary, we get that. Make up similar data and see if you can reproduce the problem (something small).
- Don’t explain the situation vaguely in words, like you have a DataFrame which is “large”, mention some of the column names in passing (be sure not to mention their dtypes). Try and go into lots of detail about something which is completely meaningless without seeing the actual context. Presumably no one is even going to read to the end of this paragraph.
Essays are bad; it’s easier with small examples.
- Don’t include 10+ (100+??) lines of data munging before getting to your actual question.
Please, we see enough of this in our day jobs. We want to help, but not like this…. Cut the intro, and just show the relevant DataFrames (or small versions of them) in the step which is causing you trouble.
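If the real data is proprietary, one possible approach is to generate a small stand-in frame with a fixed seed so everyone gets the same numbers. The sketch below assumes a hypothetical helper make_example and invented column names date and value; adapt them to mirror the columns and dtypes of your real data.

```python
import numpy as np
import pandas as pd


def make_example(seed=0, rows=5):
    """Hypothetical helper: a small synthetic stand-in for a private CSV."""
    np.random.seed(seed)  # fixed seed so every reader gets the same frame
    return pd.DataFrame({
        # Dates deliberately stored as strings, as they often are after read_csv.
        'date': pd.date_range('2024-01-01', periods=rows, freq='D').strftime('%Y-%m-%d'),
        'value': np.random.randn(rows),
    })


df = make_example()

# Be explicit about the dtype instead of assuming the column is already a date.
df['date'] = pd.to_datetime(df['date'])
print(df.dtypes)
```

Keeping the generator in a function means the question can show one call instead of ten lines of munging.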
There are different ways to delete an array element, where some are more useful for some specific tasks than others.
Deleting a Single Array Element
If you want to delete just one single array element you can use unset() and alternatively array_splice().
By key or by value?
If you know the value and don’t know the key to delete the element you can use array_search() to get the key. This only works if the element doesn’t occur more than once, since array_search() returns the first hit only.
unset() Expression
Note: When you use unset() the array keys won’t change. If you want to reindex the keys you can use array_values() after unset(), which will convert all keys to numerically enumerated keys starting from 0 (the array remains a list).
Example Code:
Example Output:
array_splice() Function
If you use array_splice() the (integer) keys will automatically be reindexed, but the associative (string) keys won’t change, as opposed to array_values() after unset(), which will convert all keys to numerical keys.
Note: array_splice() needs the offset, not the key, as the second parameter; offset = array_flip(array_keys(array))[key].
Example Code:
Example Output:
array_splice(), same as unset(), takes the array by reference. You don’t assign the return values back to the array.
Deleting Multiple Array Elements
If you want to delete multiple array elements and don’t want to call unset() or array_splice() multiple times you can use the functions array_diff() or array_diff_key() depending on whether you know the values or the keys of the elements to remove from the array.
array_diff() Function
If you know the values of the array elements which you want to delete, then you can use array_diff(). As before with unset() it won’t change the keys of the array.
Example Code:
Example Output:
array_diff_key() Function
If you know the keys of the elements which you want to delete, then you want to use array_diff_key(). You have to make sure you pass the keys as keys in the second parameter and not as values. Keys won’t reindex.
Example Code:
Example Output:
If you want to use unset() or array_splice() to delete multiple elements with the same value you can use array_keys() to get all the keys for a specific value and then delete all elements.
array_filter() Function
If you want to delete all elements with a specific value in the array you can use array_filter().
Example Code:
Example Output: