Home - WikiQuora


W3spoint99 (Beginner)

Asked: December 26, 2024, in Python

How to make good reproducible pandas examples?


Having spent a decent amount of time watching both the r and pandas tags on SO, the impression that I get is that pandas questions are less likely to contain reproducible data. This is ...

Saralyn (Beginner)
Added an answer on December 26, 2024 at 2:02 pm

The Good:

Do include a small example DataFrame, either as runnable code:
In [1]: df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])

or make it “copy and pasteable” using pd.read_clipboard(sep=r'\s\s+').

In [2]: df
Out[2]:
   A  B
0  1  2
1  1  3
2  4  6

Test it yourself to make sure it works and reproduces the issue.

You can format the text for Stack Overflow by highlighting and using Ctrl+K (or prepend four spaces to each line), or place three backticks (```) above and below your code with your code unindented.

I really do mean small. The vast majority of example DataFrames could be fewer than 6 rows [citation needed], and I bet I can do it in 5. Can you reproduce the error with df = df.head()? If not, fiddle around to see if you can make up a small DataFrame which exhibits the issue you are facing.
But every rule has an exception, the obvious one being for performance issues (in which case definitely use %timeit and possibly %prun to profile your code), where you should generate:

df = pd.DataFrame(np.random.randn(100000000, 10))

Consider using np.random.seed so we have the exact same frame. Having said that, “make this code fast for me” is not strictly on topic for the site.

For getting runnable code, df.to_dict is often useful, with the different orient options for different cases. In the example above, I could have grabbed the data and columns from df.to_dict('split').
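As a rough sketch of that round trip (the names `payload` and `rebuilt` are only illustrative, not part of the original answer), the dict produced by `to_dict('split')` can be pasted into a question and turned back into the same frame:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])

# The 'split' orientation keeps index, columns and data as separate lists,
# which is compact enough to paste into a question.
payload = df.to_dict('split')
print(payload)

# Anyone reading the question can rebuild the frame from the pasted dict.
rebuilt = pd.DataFrame(
    payload['data'],
    index=payload['index'],
    columns=payload['columns'],
)
print(rebuilt)
```

Other orient values (for example 'records' or 'list') may read better for wide or nested data; which one is shortest depends on the frame.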

Write out the outcome you desire (similarly to above)
In [3]: iwantthis
Out[3]:
   A  B
0  1  5
1  4  6

Explain where the numbers come from:

The 5 is the sum of the B column for the rows where A is 1.

Do show the code you’ve tried:
In [4]: df.groupby('A').sum()
Out[4]:
   B
A
1  5
4  6

But say what’s incorrect:

The A column is in the index rather than a column.

Do show you’ve done some research (search the documentation, search Stack Overflow), and give a summary:

The docstring for sum simply states “Compute sum of group values”

The groupby documentation doesn’t give any examples for this.

Aside: the answer here is to use df.groupby('A', as_index=False).sum().
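Put together, the example question from this section fits in a few lines. This is a minimal sketch using the same small frame; the variable names `indexed` and `flat` are assumptions made for illustration:

```python
import pandas as pd

df = pd.DataFrame([[1, 2], [1, 3], [4, 6]], columns=['A', 'B'])

# What was tried: the grouping key 'A' ends up in the index.
indexed = df.groupby('A').sum()
print(indexed)

# as_index=False keeps 'A' as an ordinary column, matching `iwantthis`.
flat = df.groupby('A', as_index=False).sum()
print(flat)
```

Anyone can paste this, run it, and see both the unwanted and the wanted result, which is the whole point of a reproducible example.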

If it’s relevant that you have Timestamp columns, e.g. you’re resampling or something, then be explicit and apply pd.to_datetime to them for good measure.
df['date'] = pd.to_datetime(df['date']) # this column ought to be date.

Sometimes this is the issue itself: they were strings.

The Bad:

Don’t include a MultiIndex, which we can’t copy and paste (see above). This is kind of a grievance with Pandas’ default display, but nonetheless annoying:
In [11]: df
Out[11]:
     C
A B
1 2  3
  2  6

The correct way is to include an ordinary DataFrame with a set_index call:

In [12]: df = pd.DataFrame([[1, 2, 3], [1, 2, 6]], columns=['A', 'B', 'C'])
In [13]: df = df.set_index(['A', 'B'])
In [14]: df
Out[14]:
     C
A B
1 2  3
  2  6

Do provide insight to what it is when giving the outcome you want:
   B
A
1  1
5  0

Be specific about how you got the numbers (what are they)… double check they’re correct.

If your code throws an error, do include the entire stack trace. This can be edited out later if it’s too noisy. Show the line number and the corresponding line of your code which it’s raising against.

Pandas 2.0 introduced a number of changes, and Pandas 1.0 before that, so if you’re getting unexpected output, include the version:
pd.__version__

On that note, you might also want to include the version of Python, your OS, and any other libraries. You could use pd.show_versions() or the session_info package (which shows loaded libraries and Jupyter/IPython environment).

The Ugly:

Don’t link to a CSV file we don’t have access to (and ideally don’t link to an external source at all).
df = pd.read_csv('my_secret_file.csv') # ideally with lots of parsing options

Most data is proprietary, we get that. Make up similar data and see if you can reproduce the problem (something small).

Don’t explain the situation vaguely in words, like you have a DataFrame which is “large”, mention some of the column names in passing (be sure not to mention their dtypes). Try and go into lots of detail about something which is completely meaningless without seeing the actual context. Presumably no one is even going to read to the end of this paragraph.
Essays are bad; it’s easier with small examples.

Don’t include 10+ (100+??) lines of data munging before getting to your actual question.
Please, we see enough of this in our day jobs. We want to help, but not like this…. Cut the intro, and just show the relevant DataFrames (or small versions of them) in the step which is causing you trouble.


W3spoint99 (Beginner)

Asked: December 26, 2024, in Python

How Slicing in Python works?


How does Python’s slice notation (Slicing) work? That is: when I write code like a[x:y:z], a[:], a[::2] etc., how can I understand which elements end up in the slice?

Saralyn (Beginner)
Added an answer on December 26, 2024 at 1:59 pm
This answer was edited.

The syntax is:

a[start:stop]  # items start through stop-1
a[start:]      # items start through the rest of the array
a[:stop]       # items from the beginning through stop-1
a[:]           # a copy of the whole array

There is also the step value, which can be used with any of the above:

a[start:stop:step] # start through not past stop, by step

The key point to remember is that the :stop value represents the first value that is not in the selected slice. So, the difference between stop and start is the number of elements selected (if step is 1, the default).

The other feature is that start or stop may be a negative number, which means it counts from the end of the array instead of the beginning. So:

a[-1]    # last item in the array
a[-2:]   # last two items in the array
a[:-2]   # everything except the last two items

Similarly, step may be a negative number:

a[::-1]    # all items in the array, reversed
a[1::-1]   # the first two items, reversed
a[:-3:-1]  # the last two items, reversed
a[-3::-1]  # everything except the last two items, reversed

Python is kind to the programmer if there are fewer items than you ask for. For example, if you ask for a[:-2] and a only contains one element, you get an empty list instead of an error. Sometimes you would prefer the error, so you have to be aware that this may happen.
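The rules above can be checked on a concrete list. This is a small sketch; the list `a` and the names on the left are chosen only for illustration:

```python
a = list(range(10))  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

middle = a[2:5]            # [2, 3, 4]: index 5 is the first item NOT included
count = len(a[2:5])        # 3, i.e. stop - start when step is 1

last_two = a[-2:]          # [8, 9]
all_but_last_two = a[:-2]  # [0, 1, 2, 3, 4, 5, 6, 7]

reversed_all = a[::-1]       # [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
first_two_rev = a[1::-1]     # [1, 0]
last_two_rev = a[:-3:-1]     # [9, 8]
rest_rev = a[-3::-1]         # [7, 6, 5, 4, 3, 2, 1, 0]

# Out-of-range slices are clamped instead of raising an error.
short = [42]
empty = short[:-2]         # []
```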

Relationship with the slice object

A slice object can represent a slicing operation, i.e.:

a[start:stop:step]

is equivalent to:

a[slice(start, stop, step)]

Slice objects also behave slightly differently depending on the number of arguments, similar to range(), i.e. both slice(stop) and slice(start, stop[, step]) are supported. To skip specifying a given argument, one might use None, so that e.g. a[start:] is equivalent to a[slice(start, None)] or a[::-1] is equivalent to a[slice(None, None, -1)].

While the :-based notation is very helpful for simple slicing, the explicit use of slice() objects simplifies the programmatic generation of slicing.
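One way this programmatic use might look is sketched below. The helper `chunk_slices` is hypothetical, written only for this example; it builds slice objects in a loop, which the colon notation cannot express as data:

```python
a = list(range(10))

# slice objects are ordinary values, so they can be stored and reused.
last_three = slice(-3, None)         # same as a[-3:]
every_other = slice(None, None, 2)   # same as a[::2]


def chunk_slices(length, size):
    """Yield slice objects that cover range(length) in chunks of `size`."""
    for start in range(0, length, size):
        yield slice(start, start + size)


chunks = [a[s] for s in chunk_slices(len(a), 4)]
print(a[last_three])   # [7, 8, 9]
print(a[every_other])  # [0, 2, 4, 6, 8]
print(chunks)          # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

The last chunk is shorter because a stop value past the end of the list is clamped, as described above.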

W3spoint99 (Beginner)

Asked: December 25, 2024, in MySQL

How to create pivot table in mysql?


If I have a MySQL table looking something like this: company_name action pagecount Company A PRINT 3 Company A PRINT 2 Company A PRINT 3 Company ...

Saralyn (Beginner)
Added an answer on December 25, 2024 at 10:21 am

Many people just use a tool like MS Excel, OpenOffice or another spreadsheet tool for this purpose. This is a valid solution: just copy the data over and use the tools the GUI offers.

But… this wasn’t the question, and it might even lead to some disadvantages, like how to get the data into the spreadsheet, problematic scaling and so on.

The SQL way…

Given that the asker's table looks something like this:

CREATE TABLE `test_pivot` (
  `pid` bigint(20) NOT NULL AUTO_INCREMENT,
  `company_name` varchar(32) DEFAULT NULL,
  `action` varchar(16) DEFAULT NULL,
  `pagecount` bigint(20) DEFAULT NULL,
  PRIMARY KEY (`pid`)
) ENGINE=MyISAM;

Now look at the desired table:

company_name    EMAIL   PRINT 1 pages   PRINT 2 pages   PRINT 3 pages
-------------------------------------------------------------
CompanyA        0       0               1               3
CompanyB        1       1               2               0

The columns (EMAIL, PRINT x pages) resemble conditions. The main grouping is by company_name.

Setting up the conditions calls for a CASE expression. In order to group by something, well, use … GROUP BY.

The basic SQL providing this pivot can look something like this:

SELECT  P.`company_name`,
    COUNT(
        CASE
            WHEN P.`action`='EMAIL'
            THEN 1
            ELSE NULL
        END
    ) AS 'EMAIL',
    COUNT(
        CASE
            WHEN P.`action`='PRINT' AND P.`pagecount` = '1'
            THEN P.`pagecount`
            ELSE NULL
        END
    ) AS 'PRINT 1 pages',
    COUNT(
        CASE
            WHEN P.`action`='PRINT' AND P.`pagecount` = '2'
            THEN P.`pagecount`
            ELSE NULL
        END
    ) AS 'PRINT 2 pages',
    COUNT(
        CASE
            WHEN P.`action`='PRINT' AND P.`pagecount` = '3'
            THEN P.`pagecount`
            ELSE NULL
        END
    ) AS 'PRINT 3 pages'
FROM    test_pivot P
GROUP BY P.`company_name`;

This should provide the desired result very fast. The major downside of this approach is that the more columns you want in your pivot table, the more conditions you need to define in your SQL statement.

This can be dealt with, too: people tend to use prepared statements, routines, counters and such to generate the conditions.
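To try the conditional-count idea without a MySQL server, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in. SQLite is not MySQL, so the column types are simplified, and the sample rows are made up so that they reproduce the desired table shown earlier:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test_pivot ("
    " pid INTEGER PRIMARY KEY AUTOINCREMENT,"
    " company_name TEXT, action TEXT, pagecount INTEGER)"
)

# Made-up sample rows (an assumption, not the asker's real data).
rows = [
    ("CompanyA", "PRINT", 3), ("CompanyA", "PRINT", 3), ("CompanyA", "PRINT", 3),
    ("CompanyA", "PRINT", 2),
    ("CompanyB", "EMAIL", None),
    ("CompanyB", "PRINT", 1),
    ("CompanyB", "PRINT", 2), ("CompanyB", "PRINT", 2),
]
conn.executemany(
    "INSERT INTO test_pivot (company_name, action, pagecount) VALUES (?, ?, ?)",
    rows,
)

# COUNT ignores NULL, so each CASE counts only the rows matching its condition.
pivot = conn.execute(
    """
    SELECT company_name,
           COUNT(CASE WHEN action = 'EMAIL' THEN 1 END) AS email,
           COUNT(CASE WHEN action = 'PRINT' AND pagecount = 1 THEN 1 END) AS print_1,
           COUNT(CASE WHEN action = 'PRINT' AND pagecount = 2 THEN 1 END) AS print_2,
           COUNT(CASE WHEN action = 'PRINT' AND pagecount = 3 THEN 1 END) AS print_3
    FROM test_pivot
    GROUP BY company_name
    ORDER BY company_name
    """
).fetchall()

for row in pivot:
    print(row)
```

A CASE without an ELSE branch yields NULL, which is why the explicit ELSE NULL from the MySQL query can be dropped here.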

Some additional links about this topic:

http://anothermysqldba.blogspot.de/2013/06/pivot-tables-example-in-mysql.html

http://www.codeproject.com/Articles/363339/Cross-Tabulation-Pivot-Tables-with-MySQL

http://datacharmer.org/downloads/pivot_tables_mysql_5.pdf

https://codingsight.com/pivot-tables-in-mysql/


W3spoint99 (Beginner)

Asked: December 25, 2024, in Programmers

How to prevent SQL injection in PHP?


If user input is inserted without modification into an SQL query, then the application becomes vulnerable to SQL injection, like in the following example: $unsafe_variable = $_POST['user_input']; mysql_query("INSERT INTO `table` (`column`) VALUES ('$unsafe_variable')"); That’s because the user can input something ...

Saralyn (Beginner)
Added an answer on December 25, 2024 at 10:08 am

The correct way to avoid SQL injection attacks, no matter which database you use, is to separate the data from SQL, so that data stays data and will never be interpreted as commands by the SQL parser. It is possible to create an SQL statement with correctly formatted data parts, but if you don’t fully understand the details, you should always use prepared statements and parameterized queries. These are SQL statements that are sent to and parsed by the database server separately from any parameters. This way it is impossible for an attacker to inject malicious SQL.

You basically have two options to achieve this:

Using PDO (for any supported database driver):

$stmt = $pdo->prepare('SELECT * FROM users WHERE name = :name');
$stmt->execute([ 'name' => $name ]);

foreach ($stmt as $row) {
    // Do something with $row
}

Using MySQLi (for MySQL):
Since PHP 8.2 we can use execute_query(), which prepares the statement, binds the parameters, and executes it in one call:

$result = $db->execute_query('SELECT * FROM users WHERE name = ?', [$name]);
while ($row = $result->fetch_assoc()) {
    // Do something with $row
}

Up to PHP 8.1:

$stmt = $db->prepare('SELECT * FROM employees WHERE name = ?');
$stmt->bind_param('s', $name); // 's' specifies variable type 'string'
$stmt->execute();
$result = $stmt->get_result();
while ($row = $result->fetch_assoc()) {
    // Do something with $row
}

If you’re connecting to a database other than MySQL, there is a driver-specific second option that you can refer to (for example, pg_prepare() and pg_execute() for PostgreSQL). PDO is the universal option.

Correctly setting up the connection

PDO

Note that when using PDO to access a MySQL database real prepared statements are not used by default. To fix this you have to disable the emulation of prepared statements. An example of creating a connection using PDO is:

$dsn = 'mysql:dbname=dbtest;host=127.0.0.1;charset=utf8mb4';
$dbConnection = new PDO($dsn, 'user', 'password');

$dbConnection->setAttribute(PDO::ATTR_EMULATE_PREPARES, false);
$dbConnection->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);

In the above example, the error mode isn’t strictly necessary, but it is advised to add it. This way PDO will inform you of all MySQL errors by means of throwing the PDOException.

What is mandatory, however, is the first setAttribute() line, which tells PDO to disable emulated prepared statements and use real prepared statements. This makes sure the statement and the values aren’t parsed by PHP before sending it to the MySQL server (giving a possible attacker no chance to inject malicious SQL).

Although you can set the charset in the options of the constructor, it’s important to note that ‘older’ versions of PHP (before 5.3.6) silently ignored the charset parameter in the DSN.

Mysqli

For mysqli we have to follow the same routine:

mysqli_report(MYSQLI_REPORT_ERROR | MYSQLI_REPORT_STRICT); // error reporting
$dbConnection = new mysqli('127.0.0.1', 'username', 'password', 'test');
$dbConnection->set_charset('utf8mb4'); // charset

Explanation

The SQL statement you pass to prepare is parsed and compiled by the database server. By specifying parameters (either a ? or a named parameter like :name in the example above) you tell the database engine which values you want to filter on. Then when you call execute, the prepared statement is combined with the parameter values you specify.

The important thing here is that the parameter values are combined with the compiled statement, not an SQL string. SQL injection works by tricking the script into including malicious strings when it creates SQL to send to the database. So by sending the actual SQL separately from the parameters, you limit the risk of ending up with something you didn’t intend.

Any parameters you send when using a prepared statement will just be treated as strings (although the database engine may do some optimization so parameters may end up as numbers too, of course). In the example above, if the $name variable contains 'Sarah'; DELETE FROM employees the result would simply be a search for the string "'Sarah'; DELETE FROM employees", and you will not end up with an empty table.
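The same behaviour can be observed outside PHP. The sketch below uses Python's built-in sqlite3 module rather than PDO or MySQLi, so it only illustrates the principle; the employees table and its two rows are made up for the demonstration:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT)")
conn.executemany(
    "INSERT INTO employees (name) VALUES (?)",
    [("Sarah",), ("Tom",)],
)

# Hostile input of the kind described above.
name = "'Sarah'; DELETE FROM employees"

# The ? placeholder sends `name` as a bound value, never as SQL text.
matches = conn.execute(
    "SELECT * FROM employees WHERE name = ?", (name,)
).fetchall()

remaining = conn.execute("SELECT COUNT(*) FROM employees").fetchone()[0]

print(matches)    # no employee has that literal name
print(remaining)  # the table still holds both rows
```

The query simply searches for an employee whose name is the whole hostile string, finds none, and leaves the table untouched.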

Another benefit of using prepared statements is that if you execute the same statement many times in the same session it will only be parsed and compiled once, giving you some speed gains.

Oh, and since you asked about how to do it for an insert, here’s an example (using PDO):

$stmt = $db->prepare('INSERT INTO table (column) VALUES (:column)');
$stmt->execute(['column' => $value]);

Can prepared statements be used for dynamic queries?

While you can still use prepared statements for the query parameters, the structure of the dynamic query itself cannot be parametrized and certain query features cannot be parametrized.

For these specific scenarios, the best thing to do is use a whitelist filter that restricts the possible values.

// Value whitelist
// $dir can only be 'DESC', otherwise it will be 'ASC'
if (empty($dir) || $dir !== 'DESC') {
    $dir = 'ASC';
}
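The same whitelist idea extends to other parts of a query that cannot be parametrized, such as the sort column. Here is a minimal sketch in Python; the function `build_order_by` and the column names in `ALLOWED_COLUMNS` are hypothetical and exist only for this example:

```python
# Hypothetical set of columns the application allows sorting by.
ALLOWED_COLUMNS = {"name", "created_at"}


def build_order_by(column, direction):
    """Return an ORDER BY clause built only from whitelisted pieces."""
    if column not in ALLOWED_COLUMNS:
        raise ValueError(f"sorting by {column!r} is not allowed")
    # Anything other than the exact string 'DESC' falls back to 'ASC'.
    direction = "DESC" if direction == "DESC" else "ASC"
    return f"ORDER BY {column} {direction}"


print(build_order_by("name", "DESC"))
print(build_order_by("name", "DESC; DROP TABLE users"))
```

Because the returned string is assembled only from values the code itself defines, user input never reaches the SQL text; it only selects among fixed options.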


Saralyn (Beginner)

Asked: December 24, 2024, in Programmers

What is a NullPointerException?


W3spoint99 (Beginner)
Added an answer on December 24, 2024 at 8:11 am

According to the Java docs:

Thrown when an application attempts to use null in a case where an object is required. These include:

Calling the instance method of a null object.

Accessing or modifying the field of a null object.

Taking the length of null as if it were an array.

Accessing or modifying the slots of null as if it were an array.

Throwing null as if it were a Throwable value.

Applications should throw instances of this class to indicate other illegal uses of the null object.

It is also the case that if you attempt to use a null reference with synchronized, that will also throw this exception.

SynchronizedStatement: synchronized ( Expression ) Block

Otherwise, if the value of the Expression is null, a NullPointerException is thrown.

There are two overarching types of variables in Java:

Primitives: variables that contain data. If you want to manipulate the data in a primitive variable you can manipulate that variable directly. By convention primitive types start with a lowercase letter. For example variables of type int or char are primitives.

References: variables that contain the memory address of an Object i.e. variables that refer to an Object. If you want to manipulate the Object that a reference variable refers to you must dereference it. Dereferencing usually entails using . to access a method or field, or using [ to index an array. By convention reference types are usually denoted with a type that starts in uppercase. For example variables of type Object are references.

Consider the following code where you declare a variable of primitive type int and don’t initialize it:

int x;
int y = x + x;

These two lines will not compile (when x is a local variable) because no value is specified for x and we are trying to use x's value to specify y. Local primitive variables have to be initialized to a usable value before they are read.

Now here is where things get interesting. Reference variables can be set to null, which means "I am referencing nothing". You can get a null value in a reference variable if you explicitly set it that way, or if a reference variable is uninitialized and the compiler does not catch it (Java will automatically set the variable to null).

If a reference variable is set to null either explicitly by you or through Java automatically, and you attempt to dereference it you get a NullPointerException.

The NullPointerException (NPE) typically occurs when you declare a variable but did not create an object and assign it to the variable before trying to use the contents of the variable. So you have a reference to something that does not actually exist.

Take the following code:

Integer num;
num = new Integer(10);

The first line declares a variable named num, but it does not actually contain a reference value yet. Since you have not yet said what to point to, Java sets it to null.

In the second line, the new keyword is used to instantiate (or create) an object of type Integer, and the reference variable num is assigned to that Integer object.

If you attempt to dereference num before creating the object you get a NullPointerException. In the most trivial cases, the compiler will catch the problem and let you know that “num may not have been initialized,” but sometimes you may write code that does not directly create the object.

For instance, you may have a method as follows:

public void doSomething(SomeObject obj) {
    // Do something to obj, assumes obj is not null
    obj.myMethod();
}

In which case, you are not creating the object obj, but rather assuming that it was created before the doSomething() method was called. Note, it is possible to call the method like this:

doSomething(null);

In which case, obj is null, and the statement obj.myMethod() will throw a NullPointerException.

If the method is intended to do something to the passed-in object as the above method does, it is appropriate to throw the NullPointerException because it’s a programmer error and the programmer will need that information for debugging purposes.

In addition to NullPointerExceptions thrown as a result of the method’s logic, you can also check the method arguments for null values and throw NPEs explicitly by adding something like the following near the beginning of a method:

// Throws an NPE with a custom error message if obj is null
Objects.requireNonNull(obj, "obj must not be null");

Note that it’s helpful to say in your error message clearly which object cannot be null. The advantage of validating this is that 1) you can return your own clearer error messages and 2) for the rest of the method you know that unless obj is reassigned, it is not null and can be dereferenced safely.

Alternatively, there may be cases where the purpose of the method is not solely to operate on the passed in object, and therefore a null parameter may be acceptable. In this case, you would need to check for a null parameter and behave differently. You should also explain this in the documentation. For example, doSomething() could be written as:

public void doSomething(SomeObject obj) {
    if (obj == null) {
        // Do something
    } else {
        // Do something else
    }
}

Java 14 added helpful NullPointerException messages (JEP 358) that describe the root cause of the exception. This is a JVM feature rather than a language feature, and it has been part of the SAP commercial JVM since 2006.

In Java 14, the following is a sample NullPointerException message:

Exception in thread "main" java.lang.NullPointerException: Cannot invoke "java.util.List.size()" because "list" is null

List of situations that cause a NullPointerException to occur

Here are all the situations in which a NullPointerException occurs that are directly mentioned by the Java Language Specification:

Accessing (i.e. getting or setting) an instance field of a null reference. (static fields don’t count!)

Calling an instance method of a null reference. (static methods don’t count!)

throw null;

Accessing elements of a null array.

Synchronising on null – synchronized (someNullReference) { ... }

Any integer/floating point operator can throw a NullPointerException if one of its operands is a boxed null reference

An unboxing conversion throws a NullPointerException if the boxed value is null.

Calling super on a null reference throws a NullPointerException. If you are confused, this is talking about qualified superclass constructor invocations:

class Outer {
    class Inner {}
}
class ChildOfInner extends Outer.Inner {
    ChildOfInner(Outer o) {
        o.super(); // if o is null, NPE gets thrown
    }
}

Using a for (element : iterable) loop to loop through a null collection/array.

switch (foo) { ... } (whether it's an expression or a statement) can throw a NullPointerException when foo is null.

foo.new SomeInnerClass() throws a NullPointerException when foo is null.

Method references of the form name1::name2 or primaryExpression::name throw a NullPointerException when evaluated if name1 or primaryExpression evaluates to null.

A note from the JLS here says that someInstance.someStaticMethod() doesn't throw an NPE, because someStaticMethod is static, but someInstance::someStaticMethod still throws an NPE!